Thermodynamics and the Structure of Quantum Theory as a Generalized
Probabilistic Theory:

This thesis investigates the connection between quantum theory, thermodynamics
and information theory. Theories with a structure similar to that of quantum
theory are considered, described mathematically by the framework of “Generalized
Probabilistic Theories”. For these theories, a thought experiment by von
Neumann [5] is adapted to obtain a natural thermodynamic entropy definition,
following a proposal by J. Barrett [7]. Mathematical properties of this entropy
are compared to physical consequences of the thought experiment. The validity
of the second law of thermodynamics is investigated. In that context, observables
and projective measurements are generalized to prove an entropy increase
for projective measurements of ensembles. Information-theoretically motivated
definitions of the entropy introduced in [23] [?] are compared to the entropy from
the thermodynamic thought experiment. The conditions for the thermodynamic
entropy to be well-defined are considered in greater detail. Several further
properties of the theories under consideration (e.g. whether there is higher-order
interference, Pfister’s state discrimination principle [13]) and their relation to
entropy are investigated.
1 Introduction

Although quantum theory has existed for roughly 100 years, it still remains mysterious.
Many people have pondered quantum theory, asking for the true reality, hidden
determinism, ... without coming to a real conclusion. Many interpretations
with identical predictions have appeared [?]; therefore it is not possible to find
the “right” interpretation by experiment. Despite these fundamental conceptual
issues, quantum theory is extremely successful in experiment and technology. As
summarized by Mermin’s famous sentence “shut up and calculate” [1], a large part of
the scientific community has turned away from the foundations of quantum physics
and prefers to apply quantum physics to concrete physical systems. An important
reason for this decision is that the mathematical formalism provided by quantum
theory can be used without understanding its origin. Another important reason is
that many attempts to think about quantum theory remain very vague or appear
helpless. Explanations in the style of collapsing electron clouds, pilot waves,
infinitely many realities, ... often seem to miss the point, overcomplicating quantum
theory without really solving its conceptual problems. Sometimes explanations
motivated by classical intuition even contradict results from standard
quantum theory, especially Bell’s Theorem [?].

Thus, for a very long time, many people lost interest in asking complicated
questions about the foundations of quantum physics that seemingly cannot be
answered anyway.

This changed when a new approach reached the field of quantum foundations
[18]: The rise of quantum information theory showed that it is fruitful to take an
operational/information-theoretic approach to thinking about physics. This approach
is inspired by both relativity and quantum field theory in a very general sense:
Relativity gives the observer a fundamental role in reality, as indicated by the famous
statement “everything is relative”. Quantum field theory and particle physics care
very much about producing results like cross sections and correlation functions that
can be measured in experiment; in particular, hypothetical particles and fields without
any interaction are excluded because their existence makes no difference (it is in this
sense that such particles are not real).

The operational/information-theoretic approach also assumes the point of view “real
is what can be performed or measured”. Preparations, transformations and
measurements are basic notions and are combined into a strict mathematical framework
known as Generalized Probabilistic Theories (GPTs). Historically, parts of the tools
and the formalism as well as the idea to reaxiomatize quantum theory come from
quantum logic [9] [?]. The most important success of the GPT framework is that it
allows one to replace the vague attempts to derive quantum theory by postulates that
are both mathematically precise and motivated by physical or information-theoretic
ideas. The idea to motivate postulates by considerations about computation and
information [33] [?] comes from the close connection of the framework to quantum
information theory.
While such questions are unusual for physicists, they provide many advantages:
Information and state processing can be experimentally demonstrated in a laboratory,
thus the corresponding postulates often can be tested. Furthermore, a connection
between information theory and physics is provided by the notion of entropy. The black
hole information paradox and Landauer’s principle indicate that this link might be
very deep, as summarized by Landauer’s statement “information is physical” or
Wheeler’s “it from bit” [?]. Another hint can be found in quantum teleportation: If
there was no need for classical communication, then the teleportation would happen
faster than the speed of light (see e.g. [23]). This suggests that information is a
fundamental part of reality, not just some data irrelevant for the physical processes.
Thus the unsatisfying attempts to explain quantum theory might be
caused by asking the wrong questions, neglecting a possible fundamental importance
of information and its relevance for observers, measurements and transformations.

This thesis explores the connection between quantum theory, thermodynamics and
quantum information theory. Two postulates, one of them motivated by physics and
the other one by information theory, are used to provide strong structural
properties for theories. For these theories, which include quantum and classical physics
and many more (e.g. quaternionic quantum theory), an old thought experiment
by von Neumann [5] is used to derive a von Neumann-like thermodynamic entropy,
following a suggestion by J. Barrett [7]. Many properties of this entropy are proven.
The postulates, supplemented by two other postulates, have already been used to
derive quantum theory (in finite dimensions) [?] and thus suggest a deep connection
between quantum theory and thermodynamics. Another point of view is that these
postulates summarize structural properties of quantum theory relevant for
thermodynamics into two well-defined mathematical statements.

The thesis is structured as follows: At first, some basic definitions and results from
convex geometry are explained in Chapter 2. These are necessary to introduce
the mathematical framework called “Generalized Probabilistic Theories (GPTs)” in
Chapter 3. This framework allows one to derive quantum theory by using exact,
mathematically well-defined postulates. Afterwards, in Chapter 4, the main postulates
of this thesis are introduced and motivated. Then, in Chapter 5, the thought
experiment by von Neumann which derives the von Neumann entropy is presented.
We apply this thought experiment to our GPTs to find a corresponding notion of
entropy. In Chapter 6 a generalized version of the thought experiment is presented,
which is more general but less elegant. From this experiment follows an important
property of the entropy, whose validity is checked for several theories. In Chapter
7 we generalize the projective measurements known from quantum theory to our
GPTs. They can be used to describe the semi-permeable membranes used in the
thought experiments. Furthermore, it is possible to prove the second law of
thermodynamics for these measurements, which is done in Chapter 8. The second
law is also checked in mixing procedures. As most of our considerations so far have been
from a thermodynamic point of view, in Chapter 9 we analyze the entropy from
an information-theoretic/operational point of view. Here, measurement and
decomposition entropies [?] will be introduced and compared to the thermodynamic
entropy. Furthermore, the same is done for the Rényi entropies. We also investigate
the question whether there is third-order interference and relate it to the entropies.
Then, in Chapter 10, an example for a state space is constructed to show that our
first postulate alone is not enough for a well-defined entropy. In Chapter 11 we will
show that a principle called the state discrimination principle (introduced by Corsin
Pfister) holds in all theories considered by us. Finally, an outlook is given in
Chapter 12.
2 Convex geometry

Our first goal is to introduce the framework called Generalized Probabilistic
Theories (GPTs). This framework allows us to discuss many probabilistic theories, including
quantum theory and classical probability theory. The basic framework is very
natural, relying only on very weak assumptions. These assumptions lead to convex sets
and convex-linear maps. Therefore, basic notions and results from convex geometry
are necessary to understand the GPT framework.

As convex geometry usually is not part of the physics curriculum, we will provide
a short introduction here. It is mainly based on [13] and [?]. Readers who
need more examples and applications should take a look at [?].

Convex sets contain all straight line segments connecting points taken from these
sets:


Definition 2.1. Let $V$ be a real vector space. A subset $C \subset V$ is called convex if
for all $v, w \in C$ and $p \in [0,1]$ also $pv + (1-p)w \in C$.

This definition directly extends to more than two points:


Proposition 2.2. Let $V$ be a real vector space and $C$ a convex subset. For any
$p_1, \dots, p_n \ge 0$ with $\sum_{j=1}^{n} p_j = 1$ and any $v_1, \dots, v_n \in C$ we find
$\sum_{j=1}^{n} p_j v_j \in C$.


Proof. We prove by induction. For $n = 1$, there is nothing to show. So assume now
the statement to be true for $n$. Wlog we assume $p_j > 0 \ \forall j$.

We rewrite:

$$\sum_{j=1}^{n+1} p_j v_j = \left(\sum_{k=1}^{n} p_k\right) \sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} v_j + p_{n+1} v_{n+1} \tag{2.1}$$

By the induction hypothesis, we find:

$$\sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} v_j \in C \tag{2.2}$$

By the definition of convexity, thus $\sum_{j=1}^{n+1} p_j v_j \in C$. □
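The induction step of this proof can be mirrored numerically. The following Python sketch builds an n-term convex combination using only two-point mixtures and compares it with the plain weighted sum. The function names (`pairwise_mix`, `convex_combination`, `direct_sum`) and the choice of exact fractions are illustrative assumptions, not part of the thesis.

```python
from fractions import Fraction


def pairwise_mix(p, v, w):
    """Two-point convex combination p*v + (1-p)*w, as in Definition 2.1."""
    return tuple(p * a + (1 - p) * b for a, b in zip(v, w))


def convex_combination(weights, points):
    """Build sum_j p_j v_j using only two-point mixtures (the induction step)."""
    if any(p < 0 for p in weights) or sum(weights) != 1:
        raise ValueError("weights must form a probability distribution")
    # Dropping zero weights corresponds to the "wlog p_j > 0" step of the proof.
    pairs = [(p, v) for p, v in zip(weights, points) if p > 0]
    total, current = pairs[0]  # accumulated weight and normalized partial mixture
    for p, v in pairs[1:]:
        new_total = total + p
        # The partial mixture enters with relative weight total / new_total.
        current = pairwise_mix(total / new_total, current, v)
        total = new_total
    return tuple(current)


def direct_sum(weights, points):
    """Reference value: the plain weighted sum of the points."""
    return tuple(sum(p * v[i] for p, v in zip(weights, points))
                 for i in range(len(points[0])))


weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
points = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
print(convex_combination(weights, points) == direct_sum(weights, points))
```

Since every intermediate value is itself a two-point mixture of elements of the set, each step stays inside any convex set containing the points, which is exactly the content of the proposition.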

However, not all points of a convex set lie in the interior of a line segment
contained in the set. The counter-examples will play an important role and thus
deserve a name:


Definition 2.3. A point $x$ of a convex set $C$ is called an extreme point of $C$ if
for all $p \in (0,1)$ and $v, w \in C$ with $x = pv + (1-p)w$ we find $v = w = x$. The set
of extreme points is called $\mathrm{ext}(C)$.


Examples for convex sets are cubes and balls. The extreme points of a cube are its
corners, while all surface points of a ball are extreme points. More examples can be
found in Figure 2.1.
Figure 2.1: The square and the circle are examples for convex sets. For a square, only 
the corners are extreme points, while for the circle, all boundary points are 
extreme points. The third set is not convex: The red line connecting two points 
of the set is not fully contained in the set. 
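For a convex set that is the convex hull of finitely many points in the plane (such as the square), the extreme points can be computed explicitly. The sketch below uses Andrew's monotone chain algorithm, which is a standard convex hull method chosen here for illustration and not taken from the thesis; the function names are hypothetical. A disc cannot be treated this way, because it has infinitely many extreme points.

```python
def cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def extreme_points(points):
    """Extreme points of conv(points) for a finite set of points in the plane."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half_hull(sequence):
        chain = []
        for p in sequence:
            # "<= 0" also discards collinear points: they lie in the interior
            # of a line segment between two other points and are never extreme.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


corners = [(0, 0), (2, 0), (2, 2), (0, 2)]
others = [(1, 0), (2, 1), (1, 1)]  # edge midpoints and the centre of the square
print(sorted(extreme_points(corners + others)) == sorted(corners))
```

Edge midpoints and interior points are discarded, so only the corners of the square survive, in agreement with the figure.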

Next we define faces. Faces are “maximal” planar surface parts, e.g. the sides of a
cube (see also Figure 2.2):

Definition 2.4. A nonempty convex subset $F$ of a convex set $C$ is called a face of
$C$ if for all $v, w \in C$ and $p \in (0,1)$ with $pv + (1-p)w \in F$ we find $v \in F$, $w \in F$.




Figure 2.2: The faces of a cuboid are its corners, its edges, its rectangles and the cuboid 
itself. 

Lemma 2.5. Let $x$ be an extreme point of a convex set $C$. Then $\{x\}$ is a face of $C$.

Proof. $\{x\}$ is not empty. $x = px + (1-p)x$, i.e. $\{x\}$ is convex. $x = pv + (1-p)w$
for $p \in (0,1)$ and $v, w \in C$ implies $x = v = w$ because $x$ is extreme. □


Lemma 2.6. Let $F$ be a face of a convex set $C$. Let $v_1, \dots, v_n \in C$ and $p_1, \dots, p_n > 0$
with $\sum_{j=1}^{n} p_j = 1$ be such that $\sum_{j=1}^{n} p_j v_j \in F$. Then for all $j$, $v_j \in F$.

Proof. We prove by induction. For $n = 2$, the statement follows immediately from
the definition. So assume now that the statement is true for $n$ states, $n \ge 2$.

We rewrite:

$$\sum_{j=1}^{n+1} p_j v_j = \left(\sum_{k=1}^{n} p_k\right) \sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} v_j + p_{n+1} v_{n+1} \tag{2.3}$$

By definition of a face,

$$\sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} v_j \in F \tag{2.4}$$

and $v_{n+1} \in F$. By the induction hypothesis $v_j \in F$ for $1 \le j \le n$. □

Corollary 2.7. Let $x$ be an extreme point of a convex set $C$ and $v_1, \dots, v_n \in C$ and
$p_1, \dots, p_n > 0$ with $\sum_{j=1}^{n} p_j = 1$ be such that $\sum_{j=1}^{n} p_j v_j = x$. Then $v_j = x \ \forall j$.


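Definition 2.8. Let $M$ be a subset of a real vector space $V$. The convex hull of
$M$, $\mathrm{conv}(M)$, and the affine hull of $M$, $\mathrm{aff}(M)$, are defined as

$$\mathrm{conv}(M) := \left\{ \sum_{j=1}^{n} p_j w_j \;\middle|\; n \in \mathbb{N},\ w_j \in M,\ p_j \ge 0 \text{ with } \sum_{j=1}^{n} p_j = 1 \right\} \tag{2.5}$$

$$\mathrm{aff}(M) := \left\{ \sum_{j=1}^{n} p_j w_j \;\middle|\; n \in \mathbb{N},\ w_j \in M,\ p_j \in \mathbb{R} \text{ with } \sum_{j=1}^{n} p_j = 1 \right\} \tag{2.6}$$

Terms of the form $\sum_{j=1}^{n} p_j w_j$ with $p_j \ge 0$ and $\sum_{j=1}^{n} p_j = 1$ are called convex
(linear) combinations. They are called affine (linear) combinations if $p_j \in \mathbb{R}$
with $\sum_{j=1}^{n} p_j = 1$.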
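The difference between the two hulls can be made concrete for a triangle in the plane. Every point of the plane is an affine combination of the three corners, but only points whose coefficients are all non-negative lie in the convex hull. The Python sketch below computes these coefficients (often called barycentric coordinates) by Cramer's rule; the function names and the restriction to integer or fractional coordinates are assumptions made for this illustration.

```python
from fractions import Fraction


def affine_coefficients(point, triangle):
    """Coefficients (p1, p2, p3) with p1 + p2 + p3 = 1 and point = sum_j p_j w_j."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    x, y = point
    # Solve point - w1 = p2 (w2 - w1) + p3 (w3 - w1) by Cramer's rule.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("the three corners must not be collinear")
    p2 = Fraction((x - x1) * (y3 - y1) - (x3 - x1) * (y - y1), det)
    p3 = Fraction((x2 - x1) * (y - y1) - (x - x1) * (y2 - y1), det)
    return (1 - p2 - p3, p2, p3)


def in_convex_hull(point, triangle):
    """A point of the affine hull lies in conv(M) iff all coefficients are >= 0."""
    return all(p >= 0 for p in affine_coefficients(point, triangle))


triangle = [(0, 0), (4, 0), (0, 4)]
print(affine_coefficients((1, 1), triangle), in_convex_hull((1, 1), triangle))
print(affine_coefficients((4, 4), triangle), in_convex_hull((4, 4), triangle))
```

For the point (4, 4) one coefficient is negative: it is an affine combination of the corners but not a convex one.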


Proposition 2.9. If $F$ is a face of a convex set $C$, then $F = \mathrm{aff}(F) \cap C$.

Proof. Trivially, $F \subset \mathrm{aff}(F)$ and $F \subset C$, thus $F \subset \mathrm{aff}(F) \cap C$. Much harder to show
is $\mathrm{aff}(F) \cap C \subset F$:

So let $v := \sum_{j=1}^{n} p_j w_j \in \mathrm{aff}(F) \cap C$ with $w_j \in F$ and $\sum_{j=1}^{n} p_j = 1$. We relabel
such that for $j \le m$, $p_j \ge 0$ and for $j > m$, $p_j < 0$. We assume $m < n$, otherwise
$v \in F$ as $v$ is given by a convex combination in $F$. Thus:

$$v + \sum_{j=m+1}^{n} |p_j| w_j = \sum_{j=1}^{m} |p_j| w_j \tag{2.7}$$

Because of $\sum_{j=1}^{n} p_j = 1$, there is at least one $p_j > 0$. Therefore:

$$\frac{1}{\sum_{k=1}^{m} |p_k|} v + \sum_{j=m+1}^{n} \frac{|p_j|}{\sum_{k=1}^{m} |p_k|} w_j = \sum_{j=1}^{m} \frac{|p_j|}{\sum_{k=1}^{m} |p_k|} w_j \tag{2.8}$$

The expression on the right-hand side is a convex combination of elements of $F$. Thus
by convexity of $F$:

$$\sum_{j=1}^{m} \frac{|p_j|}{\sum_{k=1}^{m} |p_k|} w_j \in F \tag{2.9}$$

Using

$$\frac{1}{\sum_{j=1}^{m} |p_j|} + \sum_{j=m+1}^{n} \frac{|p_j|}{\sum_{k=1}^{m} |p_k|} = \frac{1 + \sum_{k=m+1}^{n} |p_k|}{\sum_{k=1}^{m} |p_k|} = \frac{\sum_{k=1}^{n} p_k + \sum_{k=m+1}^{n} |p_k|}{\sum_{k=1}^{m} |p_k|} = 1 \tag{2.10}$$

we see that also the left-hand side of Equation (2.8) is a convex combination of
elements in $C$. As $F$ is a face and $v \in C$, it follows that $v \in F$. □


Definition 2.10. For a convex set $C$, let $M \subset C$. The face generated by $M$ is
defined as the minimal face containing $M$:

$$F := \bigcap_{H \subset C \text{ face},\ M \subset H} H \tag{2.11}$$


Proposition 2.11. For a convex set $C$, let $M \subset C$. The face $F$ generated by $M$ is
indeed a face. If $G$ is another face of $C$ containing $M$, then $F \subset G$.

Proof. The last statement is clear by definition of $F$ as intersection of all faces
containing $M$.

So it remains to show that $F$ is indeed a face:

The intersection is well defined because $C$ itself is a face which contains $M$, and
$F$ is not empty because it contains $M$ (for non-empty $M$). For $\{v_k\} \subset F$ and $\{p_k\}$ a
probability distribution, $\{v_k\}$ is also found in all faces containing $M$ by definition
of $F$. As all faces are convex, all these faces also contain $\sum_k p_k v_k$. By definition of
$F$ as an intersection, also $\sum_k p_k v_k \in F$. Thus $F$ is convex.

Now let $w = p v_1 + (1-p) v_2 \in F$ with $v_1, v_2 \in C$ and $0 < p < 1$. Every face containing
$M$ also contains $w$ and thus also $v_1, v_2$, because these sets are faces. By definition of $F$ as
intersection of all these faces, also $v_1, v_2 \in F$. □

The importance of the extreme points is that they generate (compact)
convex sets, as shown by the famous Krein-Milman theorem (see e.g. [?] Theorem
VIII.4.4):

Theorem 2.12 (Krein-Milman). Let $V$ be a locally convex topological vector space
(Hausdorff) and $C$ a compact convex subset of $V$. Then

$$C = \overline{\mathrm{conv}}(\mathrm{ext}(C)) \tag{2.12}$$

where the bar denotes the topological closure.

In finite dimension, there is a simpler version by Minkowski (see e.g. [15] Theorem
2.6.16, [?]):

Theorem 2.13. Let $V$ be a real finite-dimensional vector space and $C$ a compact
convex subset of $V$. Then

$$C = \mathrm{conv}(\mathrm{ext}(C)) \tag{2.13}$$

Note that the finite-dimensional version is much simpler: no closure is needed and
all the topological difficulties are gone. Later on, we will restrict ourselves to finite
dimension in order not to obscure the physics by topological technicalities.

Now we consider maps that preserve the convexity structure. 

Definition 2.14. A map $f : V \to W$ between finite-dimensional vector spaces is
called convex-linear if for all $p \in [0,1]$, $x, y \in V$ we have
$f(px + (1-p)y) = p f(x) + (1-p) f(y)$.

It is called affine-linear if $f(px + (1-p)y) = p f(x) + (1-p) f(y)$ for all $p \in \mathbb{R}$,
$x, y \in V$.


Proposition 2.15. Every convex-linear map is affine-linear.

Proof. We have to check $f(px + (1-p)y) = p f(x) + (1-p) f(y)$ for $x, y \in V$, $p \in \mathbb{R}$.
For $p \in [0,1]$ this is clear by convex-linearity. Now assume $p > 1$:

Then we have to show $f(x) = \frac{1}{p} f(px + (1-p)y) + \frac{1-p}{-p} f(y)$. As $p > 1$ we find
$1 - p < 0$. In particular, $0 < \frac{1}{p} < 1$ and $0 < \frac{1-p}{-p} < 1$, and the two weights
sum to 1. By convex-linearity

$$\frac{1}{p} f(px + (1-p)y) + \frac{1-p}{-p} f(y) = f\left(x + \frac{1-p}{p} y + \frac{1-p}{-p} y\right) = f(x) \tag{2.14}$$

as we had to show. The case $p < 0$ is equivalent to $1 - p > 1$ and thus can be proved
like the case before (by exchanging the roles of $x$ and $y$). □


Proposition 2.16. Let $f : V \to W$ be a convex- or affine-linear map. Then for
any $x_1, \dots, x_n \in V$, $p_1, \dots, p_n \in \mathbb{R}$ with $\sum_{j=1}^{n} p_j = 1$ we have:

$$f\left(\sum_{j=1}^{n} p_j x_j\right) = \sum_{j=1}^{n} p_j f(x_j) \tag{2.15}$$

Proof. By Proposition 2.15 we can assume that $f$ is affine-linear. We use a proof by
induction. For $n = 2$, the statement is clear by affine-linearity. Now assume that the
statement is true for $n \ge 2$:

Let the $p_j$ be labelled such that $\sum_{j=1}^{n} p_j \ne 0$. We rewrite:

\begin{align}
f\left(\sum_{j=1}^{n+1} p_j x_j\right) &= f\left(\left(\sum_{k=1}^{n} p_k\right) \sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} x_j + p_{n+1} x_{n+1}\right) \tag{2.16} \\
&= \left(\sum_{k=1}^{n} p_k\right) f\left(\sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} x_j\right) + p_{n+1} f(x_{n+1}) \tag{2.17} \\
&= \left(\sum_{k=1}^{n} p_k\right) \sum_{j=1}^{n} \frac{p_j}{\sum_{a=1}^{n} p_a} f(x_j) + p_{n+1} f(x_{n+1}) \tag{2.18}
\end{align}

where we first used affine-linearity and then the induction hypothesis. The last
expression equals $\sum_{j=1}^{n+1} p_j f(x_j)$. □


Proposition 2.17. A map $f : V \to W$ between finite-dimensional real vector spaces
is affine- (or convex-) linear exactly if it is of the form

$$f(\cdot) = L(\cdot) + y \tag{2.19}$$

for some linear map $L : V \to W$ and some $y \in W$.

Proof. Here, we only provide a sloppy proof sketch:

\begin{align}
\frac{\partial f}{\partial x_j}(x) &= \lim_{h \to 0} \frac{f(x + h e_j) - f(x)}{h} \notag \\
&= \lim_{h \to 0} \frac{1}{h} \Big[ f(x + h e_j) + 1 \cdot f(0) - 1 \cdot f(x) - 1 \cdot f(0) \Big] \tag{2.20} \\
&= \lim_{h \to 0} \frac{1}{h} \Big[ f\big(1 \cdot (x + h e_j) + 1 \cdot 0 - 1 \cdot x\big) - f(0) \Big] \notag \\
&= \lim_{h \to 0} \frac{1}{h} \Big[ f(h e_j) - f(0) \Big] \tag{2.21}
\end{align}

Here, $e_j = (\delta_{kj})_k$ is the vector filled with zeroes except for the $j$-th component, which
is a 1.

Thus the partial derivatives are constant. Therefore, “$f$ = linear map + constant”.
See also Theorem 1.5.2 from [15] for a more detailed proof. □
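The statement can be checked on a small example. The Python sketch below defines a map of the form "linear map plus constant" and verifies that it respects combinations whose coefficients sum to 1, even when some coefficients are negative, while it fails for coefficients that do not sum to 1. The matrix, the offset and the helper names are arbitrary choices made for illustration.

```python
def affine_map(L, y):
    """Return f(x) = L x + y for a matrix L (list of rows) and an offset vector y."""
    def f(x):
        return tuple(sum(row[k] * x[k] for k in range(len(x))) + offset
                     for row, offset in zip(L, y))
    return f


def combine(coefficients, vectors):
    """Linear combination sum_j p_j x_j of equal-length tuples."""
    return tuple(sum(p * x[i] for p, x in zip(coefficients, vectors))
                 for i in range(len(vectors[0])))


f = affine_map([[1, 2], [0, 3]], (5, -1))
xs = [(1, 0), (0, 1), (2, 2)]

affine = (2, -3, 2)      # coefficients sum to 1, one of them is negative
not_affine = (1, 1, 0)   # coefficients sum to 2

print(f(combine(affine, xs)) == combine(affine, [f(x) for x in xs]))
print(f(combine(not_affine, xs)) == combine(not_affine, [f(x) for x in xs]))
```

The second comparison fails because the constant offset is counted twice on the right-hand side, which is why the condition that the coefficients sum to 1 cannot be dropped unless the offset vanishes.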


Definition 2.18. Let $V$ be a real vector space. A non-empty subset $K$ of $V$ is a
cone if the following conditions are satisfied [?]:

1. $K + K \subset K$

2. $pK \subset K \ \forall p \ge 0$

3. $K \cap (-K) = \{0\}$


A typical example for a cone is an infinitely long ice-cream cone. Another typical
example, which looks like an infinitely long pyramid turned upside-down, is shown
in Figure 2.3.
Figure 2.3: A typical cone. Note that cones are infinitely long, as indicated in the figure.

Proposition 2.19. Let $K$ be a cone. Then $\mathrm{span}(K) = K - K$.

Proof. $K - K \subset \mathrm{span}(K)$ is clear by the definition of $\mathrm{span}(K)$.

Let $\sum_{j=1}^{n} p_j v_j \in \mathrm{span}(K)$ with $v_j \in K$, $p_j \in \mathbb{R}$. By relabelling, we assume $p_j \ge 0$ for
$j \le m \le n$ and $p_j < 0$ for $j > m$. Thus $\sum_{j=1}^{n} p_j v_j = \sum_{j=1}^{m} |p_j| v_j - \sum_{j=m+1}^{n} |p_j| v_j$.
Thus if we can show that terms of the form $\sum_{j} q_j w_j$ are in $K$ for $q_j \ge 0$ and
$w_j \in K$, then we find $\sum_{j=1}^{n} p_j v_j \in K - K$ and $\mathrm{span}(K) \subset K - K$ in total.

By the second property of cones, $q_j w_j \in K$. Thus by the first property of cones,
$\sum_{j} q_j w_j \in K$. □
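The splitting used in this proof can be illustrated with the non-negative orthant of $\mathbb{R}^n$, which satisfies the three cone conditions. The Python sketch below sorts the terms of a linear combination by the sign of their coefficients and returns the two cone elements whose difference is the original vector. The choice of cone and the function names are assumptions made for this example.

```python
def in_orthant_cone(v):
    """Membership test for the example cone K = {v : all components >= 0}."""
    return all(c >= 0 for c in v)


def split_into_cone_parts(coefficients, vectors):
    """Write sum_j p_j v_j (with v_j in K) as plus - minus with plus, minus in K."""
    dim = len(vectors[0])
    plus, minus = [0] * dim, [0] * dim
    for p, v in zip(coefficients, vectors):
        # Terms with p >= 0 go to the first part, terms with p < 0 to the second.
        target = plus if p >= 0 else minus
        for i in range(dim):
            target[i] += abs(p) * v[i]
    return tuple(plus), tuple(minus)


coefficients = (2, -3, 1)
vectors = [(1, 0), (1, 1), (0, 2)]  # all elements of the orthant cone
plus, minus = split_into_cone_parts(coefficients, vectors)
print(plus, minus, in_orthant_cone(plus), in_orthant_cone(minus))
```

Here the combination itself, (-1, -1), is not in the cone, but it is the difference of the two cone elements returned by the function.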

Definition 2.20. A cone $K$ of a real vector space $V$ is called generating if
$\mathrm{span}(K) = K - K = V$.


Comment. Sometimes, the definition of cones varies in the literature. For example
in [?], the condition $K \cap (-K) = \{0\}$ is not necessary for a set to be called a cone.
There, cones that satisfy $K \cap (-K) = \{0\}$ are called pointed. But in [?], all cones
are required to be generating.


Definition 2.21. Given a cone $K \subset V$, an order unit $u_K$ is an element of
$V^* = \{f : V \to \mathbb{R} \text{ linear}\}$ which is strictly positive on all non-zero elements of the
cone, i.e.

$$u_K(v) > 0 \quad \forall v \in K \setminus \{0\}. \tag{2.22}$$
3 Framework: Generalized Probabilistic Theories

3.1 The state space

In this chapter, we provide an introduction to the framework called Generalized
Probabilistic Theories (GPTs). This framework covers a wide range of physical
theories, including classical and quantum theory. It is very general, starting from
the idea that theories should specify measurement probabilities, adding only weak
and natural assumptions. Other assumptions and postulates can be added while
constructing a specific theory. Thus this framework allows us to use mathematically
well-defined postulates instead of vague motivations to single out quantum theory.
Many other sources also give introductions to this framework, but often there are
slightly different points of view or approaches [?] [12] [?] [?] [21], e.g. concerning
whether measurements or states are introduced first in the theory. We will use [?],
[18] and [33] as an orientation.

The basic notions of GPTs are states and measurements. The state $\omega$ completely
describes a physical system in the sense that the state determines the probabilities
of all the measurement outcomes for all measurements. A meaningful representation
of a state would be to just list the probabilities of all the possible measurements ($o_k$
is the outcome, $m_j$ the measurement):

$$\omega = \begin{pmatrix} \vdots \\ p(o_1|m_j) \\ p(o_2|m_j) \\ \vdots \end{pmatrix} \tag{3.1}$$

However, even for the simple example of a spin-1/2 system, there are infinitely many
axes and thus infinitely many possible measurements. Fortunately, the probabilities for
spin-up results for measurements along the x-, y- and z-axis already determine the
whole quantum state. Thus the example also shows that in many cases, knowing
the probabilities of some measurements already completely determines the outcome
probabilities of all other measurements. Such a set of measurements is called fiducial.


Next we consider the following mixing procedure: Assume we have $n$ preparation
devices, and each of them can prepare a state $\omega_j$, $j \in \{1, \dots, n\}$. Furthermore assume
that there is a random number generator with $n$ outcomes, given by the probability
distribution $(p_1, \dots, p_n)$. Now the preparation devices and the random number
generator are put into a black box with a single button on the outside. If you push that
button, the random number generator is activated. If you get outcome $j$, device $j$
is activated and a system in the state $\omega_j$ is produced and sent to the outside. An
example for this device is shown in Figure 3.1.
Figure 3.1: An example for the mixing procedure described in the text: Pushing the green
button activates a random number generator (here, a die) and depending on 
the outcome (here: odd or even), one of several states is prepared. 

As everything happens within the black box, we never learn the result of the random
number generator. Thus we only know that in a fraction $p_j$ of the cases, the state $\omega_j$ is
obtained. We wish to describe the states the box outputs by something that says
“with probability $p_j$ you get the results expected for $\omega_j$”, i.e. a statistical mixture.
If we consider the representation with fiducial probability vectors,

$$\omega_j = \begin{pmatrix} \vdots \\ P^{(\omega_j)}(o_1|m_k) \\ P^{(\omega_j)}(o_2|m_k) \\ \vdots \end{pmatrix} \tag{3.2}$$

we now show that it is meaningful to assume that the new state can be written as

$$\omega = \sum_{j=1}^{n} p_j \omega_j = \begin{pmatrix} \vdots \\ \sum_{j=1}^{n} p_j \cdot P^{(\omega_j)}(o_1|m_k) \\ \sum_{j=1}^{n} p_j \cdot P^{(\omega_j)}(o_2|m_k) \\ \vdots \end{pmatrix} \tag{3.3}$$

With probability $p_j$, the state is $\omega_j$. In case the state is $\omega_j$, for measurement $m_k$
the outcome $o_i$ occurs with probability $P^{(\omega_j)}(o_i|m_k)$. Thus the total probability for
the outcome $o_i$ of measurement $m_k$ is given by $\sum_{j=1}^{n} p_j \cdot P^{(\omega_j)}(o_i|m_k)$. So the list of
probabilities of the state $\omega$ should be of the form $\sum_{j=1}^{n} p_j \cdot P^{(\omega_j)}(o_i|m_k)$, i.e. exactly
of the form $\omega = \sum_{j=1}^{n} p_j \omega_j$ as suggested above. This result suggests that the set
of states should be embedded into a real vector space, and that statistical mixtures
are described by convex linear combinations.
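As an illustration, consider the spin-1/2 example with the three fiducial measurements along the x-, y- and z-axis. The sketch below relies on the standard quantum-mechanical fact, not derived in this text, that a spin-1/2 state with Bloch vector $r$ yields spin-up along axis $a$ with probability $(1 + r_a)/2$, and that mixing density matrices mixes their Bloch vectors. Under these assumptions, the fiducial vector of the mixture equals the convex combination of the fiducial vectors. All function names are hypothetical.

```python
from fractions import Fraction


def fiducial_vector(bloch):
    """Spin-up probabilities along x, y, z for a spin-1/2 state (assumed Bloch form)."""
    return tuple(Fraction(1 + r, 2) for r in bloch)


def mix(weights, vectors):
    """Convex combination sum_j p_j v_j of equal-length tuples."""
    return tuple(sum(p * v[i] for p, v in zip(weights, vectors))
                 for i in range(len(vectors[0])))


z_up = (0, 0, 1)  # Bloch vector of "spin up along z"
x_up = (1, 0, 0)  # Bloch vector of "spin up along x"
weights = [Fraction(1, 4), Fraction(3, 4)]

# Mix the quantum states first, then list the fiducial probabilities ...
state_then_probabilities = fiducial_vector(mix(weights, [z_up, x_up]))
# ... or mix the lists of fiducial probabilities directly.
probabilities_mixed = mix(weights, [fiducial_vector(z_up), fiducial_vector(x_up)])

print(state_then_probabilities == probabilities_mixed)
```

Both routes give the same list of probabilities, which is the operational content of describing statistical mixtures by convex combinations.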

The black-box random preparation device is an operational abstraction for a source
or preparation device whose rules are not known. For example, for a random photon
emitted by a star, we do not know which energy or polarisation was chosen by the
star, and it might be a different one for each emitted photon.

A general assumption for the sake of simplicity is that the vector space is
finite-dimensional, i.e. that a finite list of fiducial probabilities is sufficient in case of a
representation via fiducial probabilities. In quantum theory, this assumption
restricts us to finite-dimensional quantum theory. This assumption allows us to
separate the mathematical technicalities introduced by functional analysis or topology
from the “real physics”. For example, in finite dimension there is norm
equivalence, i.e. the choice of the norm is less important. Furthermore we do not have to
deal with integration measures and divergent sums. Already in finite dimension the
proofs often are really hard because of the generality of the framework. The general
idea is therefore to characterize the finite-dimensional case first. Afterwards one can
try to generalize the results to infinite dimension. Furthermore the “true”
physical theory should be capable of describing finite-dimensional systems as well. Such
finite-dimensional systems often arise in computation, for example the qubit ion
chains often used for quantum computation or the finite memory of regular
computers. Thus if a theory fails to describe such systems, it must be wrong. Furthermore,
insights from quantum gravity, especially the holographic principle, suggest that the
fundamental basis of nature could be discrete and might even be finite-dimensional
(for a non-technical introduction to quantum gravity, see e.g. [38]).

Norm equivalence and the representation by lists of fiducial probabilities suggest
that the state space should be bounded.

Now assume that there is an element $\omega$ of the vector space such that there is a
sequence of states $\omega_n$ with $\lim_{n \to \infty} \omega_n = \omega$, i.e. $\omega$ can be approached arbitrarily
well. As no preparation procedure is perfect and as all measurement devices have
a finite reliability, there is no practical difference between perfect preparability and
arbitrarily good preparability. Thus we also assume that $\omega$ is a state. This means
that the set of states should be closed. In finite dimensions, together with the
boundedness, this means that the set of states should be compact.

Furthermore, it also makes sense to define “subnormalized” states. As an example,
we consider a projective measurement $\mathcal{P}$ in quantum theory performed on a system
described by a density matrix $\rho$. We call the projectors $P_j$. There is a probability
of $\mathrm{Tr}(P_j \rho P_j)$ that the outcome $j$ occurs, and afterwards the system is described
by the density matrix $\frac{P_j \rho P_j}{\mathrm{Tr}(P_j \rho P_j)}$. Instead, we can say that the state of the system is
$P_j \rho P_j$, with the following interpretation: With probability $\mathrm{Tr}(P_j \rho P_j)$, the system
is in the state $\frac{P_j \rho P_j}{\mathrm{Tr}(P_j \rho P_j)}$ after the measurement, i.e. the notation $P_j \rho P_j$ summarizes
both the probability and the state in case of outcome $j$. Using that notation, after
averaging or forgetting the result, the total density matrix after the measurement is
described by a sum of subnormalized states $\rho' = \sum_j P_j \rho P_j$.

A slightly different usage of subnormalized states is that $\mathrm{Tr}(\rho)$ gives the probability
of success of preparation of the state. In case of failure, no system is output at all.
This point of view can be related to the projective measurement from before. Only
if outcome $j$ occurs, the system is described by $\frac{P_j \rho P_j}{\mathrm{Tr}(P_j \rho P_j)}$. If another outcome
occurred, outcome $j$ failed.

This notation can be used to hide further conditions or to include implicit conditions.
For example in the projective measurement, $\rho_j := P_j \rho P_j$ is used to perform another
projective measurement $\mathcal{Q}$ with projectors $Q_k$. The probability for outcome $k$ in the
$\mathcal{Q}$-measurement is given by $\mathrm{Tr}(Q_k \rho_j Q_k)$. This can be rewritten as:

\begin{align*}
\mathrm{Tr}(Q_k \rho_j Q_k) &= \mathrm{Tr}\left(Q_k \frac{\rho_j}{\mathrm{Tr}(\rho_j)} Q_k\right) \cdot \mathrm{Tr}(\rho_j) \\
&= \mathrm{Prob}(\text{outcome } k \text{ in } \mathcal{Q} \mid \text{outcome } j \text{ in } \mathcal{P}) \cdot \mathrm{Prob}(\text{outcome } j \text{ in } \mathcal{P}) \\
&= \mathrm{Prob}(\text{outcome } j \text{ in } \mathcal{P}, \text{ afterwards outcome } k \text{ in } \mathcal{Q})
\end{align*}

Thus now all probabilities calculated with $\rho_j$ contain the additional event that
outcome $j$ in the first measurement is obtained.
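The bookkeeping with subnormalized states can be checked on a small qubit example. The Python sketch below uses a real symmetric density matrix, projectors onto the z-basis for the first measurement and projectors onto the x-basis for the second one; these concrete matrices and all helper names are choices made for illustration, and real matrices are used only to avoid complex arithmetic.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def sandwich(projector, rho):
    """Subnormalized post-measurement state P rho P."""
    return matmul(matmul(projector, rho), projector)


half = Fraction(1, 2)
P = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]                          # first measurement
Q = [[[half, half], [half, half]], [[half, -half], [-half, half]]]  # second measurement
rho = [[Fraction(3, 4), Fraction(1, 4)], [Fraction(1, 4), Fraction(1, 4)]]


def joint_probability(j, k):
    """Tr(Q_k rho_j Q_k) computed directly from the subnormalized state rho_j."""
    return trace(sandwich(Q[k], sandwich(P[j], rho)))


def conditional_times_marginal(j, k):
    """Prob(k | j) * Prob(j), computed from the normalized post-measurement state."""
    rho_j = sandwich(P[j], rho)
    prob_j = trace(rho_j)
    normalized = [[entry / prob_j for entry in row] for row in rho_j]
    return trace(sandwich(Q[k], normalized)) * prob_j


print(all(joint_probability(j, k) == conditional_times_marginal(j, k)
          for j in range(2) for k in range(2)))
print(sum(joint_probability(j, k) for j in range(2) for k in range(2)))
```

The subnormalized state carries the probability of the first outcome along, so no separate normalization step is needed to obtain joint probabilities, and the four joint probabilities sum to one.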

All these applications show that subnormalized states are not really necessary, but
helpful to simplify notation and to put more content into a simpler expression.
We will use a function $u_A$ to specify the normalization. Using the interpretation
that the normalization gives the success probability, there should only be one state
normalized to zero. This state corresponds to certain failure/no output at all.
Furthermore, we will also consider “supernormalized” states. We do not give them a
physical meaning. However, introducing them has many mathematical advantages,
allowing us to use the full framework provided by cones from convex geometry.

Now we collect our results to define: 

Definition 3.1. A triple $(A, \Omega_A, u_A)$ is called an abstract state space iff the
following conditions hold:

1. $A$ is a finite-dimensional, real vector space.

2. $\Omega_A \subset A$ is a convex, compact subset.

3. $A_+ := \mathbb{R}_{\ge 0} \cdot \Omega_A$ is a closed, generating cone.

4. $u_A \in A^*$ is strictly positive on the non-zero elements of the cone, i.e. $u_A(\omega) > 0$
for all $\omega \in A_+$ with $\omega \ne 0$.

5. For $\omega \in A_+$: $u_A(\omega) = 1 \Leftrightarrow \omega \in \Omega_A$.

$\Omega_A$ is called the set of (normalized) states, $A_+$ the cone of unnormalized
states and the order unit $u_A$ gives the normalization. Furthermore,
$\Omega_A^{\le 1} := \{\omega \in A_+ \mid u_A(\omega) \le 1\}$ is called the set of subnormalized states.
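A simple instance of this definition is a classical system with $n$ distinguishable configurations. As an illustrative assumption (the text only states that classical theory is covered by the framework), take $A = \mathbb{R}^n$, the probability simplex as the set of normalized states, the non-negative orthant as the cone and the sum of all components as the order unit. The Python sketch below encodes the corresponding membership tests; the function names are hypothetical.

```python
from fractions import Fraction


def order_unit(w):
    """Normalization functional of the classical example: the sum of all entries."""
    return sum(w)


def in_cone(w):
    """Cone of unnormalized states: vectors with non-negative entries."""
    return all(c >= 0 for c in w)


def is_normalized(w):
    """Normalized states: cone elements on which the order unit equals 1."""
    return in_cone(w) and order_unit(w) == 1


def is_subnormalized(w):
    """Subnormalized states: cone elements on which the order unit is at most 1."""
    return in_cone(w) and order_unit(w) <= 1


fair_coin = (Fraction(1, 2), Fraction(1, 2))
print(is_normalized(fair_coin))                                     # a probability vector
print(is_subnormalized((Fraction(1, 4), Fraction(1, 4))))           # success probability 1/2
print(is_subnormalized((0, 0)))                                     # certain failure
print(in_cone((2, 1)), is_subnormalized((2, 1)))                    # "supernormalized"
```

In this example the zero vector is the only cone element with normalization zero, matching the requirement that the order unit is strictly positive on all non-zero cone elements.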
An example for a state space is shown in Figure 3.2. We note that the definition
of abstract state spaces is overcomplete: some properties are consequences of other
properties.



Figure 3.2: A state space consists of a cone of unnormalized states $A_+$, where the
normalization is defined by an order unit $u_A$. The set of normalized states $\Omega_A$ is
given by the states with $u_A(\omega) = 1$. The set of subnormalized states $\Omega_A^{\le 1}$
is given by those states with $u_A(\omega) \le 1$. The (sub)normalized states have a
physical interpretation: the normalization gives the probability of success of
preparation.


3.2 Measurements 

So far, we have only defined state spaces. We also want to describe actions on the 
system, especially measurements and transformations. 

At first we consider measurements: 

Assume there is a system in the state $w \in \Omega_A^{\leq 1}$. We wish to perform a measurement
with n different outcomes on the system. As the idea of a state is that it fully
determines the outcome probabilities of all measurements, this measurement will be
no exception. Thus it is possible to define functions $e_j : \Omega_A^{\leq 1} \to [0,1]$ that give the
probabilities $e_j(w)$ of the measurement outcomes.

Consider a black-box preparation device which prepares a system in the state $w_1$
with probability $p$, and in the state $w_2$ with probability $1 - p$. The total state is
$w = p w_1 + (1 - p) w_2$. The probability for the j-th outcome is
$e_j(w) = e_j(p w_1 + (1 - p) w_2)$.
But there is also another way to think about the black box: With probability $p$, the
system is in the state $w_1$. In that case, the probability for the j-th outcome is $e_j(w_1)$.
In the other case, which happens with probability $1 - p$, the probability for the j-th
outcome is $e_j(w_2)$. The law of total probability states that the total probability is
given by $p\, e_j(w_1) + (1 - p)\, e_j(w_2)$. As both points of view describe the same situation,
$e_j(p w_1 + (1 - p) w_2) = p\, e_j(w_1) + (1 - p)\, e_j(w_2)$. Thus the functions $e_j$ are convex-linear.
As $\mathbb{R}_{\geq 0} \cdot \Omega_A = A_+$ and $\mathrm{span}(A_+) = A$, it is reasonable to assume that the $e_j$ can be
convex-linearly continued to $A$. Furthermore, the state 0 should give 0 for all
probabilities, i.e. $e_j(0) = 0$. By Proposition 2.17, the $e_j$ are linear functions on $A$.
Note that the linearity implies $e_j(w) = u_A(w)\, e_j(w / u_A(w))$ for all $j$ and all non-zero
$w \in A_+$: If there is no system prepared, we will not measure anything. In the
appendix we will discuss the technical details showing that this intuition is right,
i.e. that it is possible to extend the effects to linear functions on $A$.


Definition 3.2. For an abstract state space, the set of effects is defined by

    $E_A := \{e \in A^* \mid 0 \leq e(w) \leq u_A(w) \ \forall w \in A_+\}$    (3.4)

i.e. effects are linear maps $e : A \to \mathbb{R}$ with $0 \leq e(w) \leq 1$ for all $w \in \Omega_A$.


Definition 3.3. For effects $e, f \in E_A$, we write

    $e \leq f \ :\Leftrightarrow\ e(w) \leq f(w) \ \forall w \in \Omega_A$    (3.5)

or equivalently

    $e \leq f \ \Leftrightarrow\ e(w) \leq f(w) \ \forall w \in A_+$    (3.6)

Likewise, $\geq$ is defined.
Furthermore, we define

    $e < f \ :\Leftrightarrow\ e(w) < f(w) \ \forall w \in \Omega_A$    (3.7)

and analogously $>$.

Furthermore, measurement probabilities on properly normalized states should sum
to 1.

Definition 3.4. A measurement is a set $\mathcal{M} = \{e_1, \dots, e_n\}$ of effects such that
$\sum_{j=1}^n e_j = u_A$.

Comment. While we have defined measurements in a mathematical sense, it is not
clear whether these measurements can be implemented in an actual experiment,
i.e. whether they are physically allowed. A typical assumption is the
“no-restriction” hypothesis, which claims that all mathematically well-defined
measurements are physically possible. Such an assumption can be justified as follows:

Of course there can be practical limitations (insufficient control, too expensive, ...),
but also conceptual problems which forbid a measurement. An example of the latter
is given by space-like separated systems, for which it is not possible to act on both
systems at the same time. However, all these limitations are not introduced by
quantum theory itself, but by the choice of physical system it is applied to. Just
as with a two-level system in quantum theory (qubit), many different physical
systems might be described by the same abstract state space. Thus it can happen that
the same mathematical measurement is impossible for one physical system, but
possible for another physical system. Thus one decides not to exclude any
well-defined measurement beforehand. However, a choice of a specific physical system
might render some measurements impossible. This means that it is not the GPT that
makes a measurement impossible, but the physical system it is applied to. As we want
our framework to be as powerful as possible, one does not exclude any measurement
without a reason.

We will not use the no-restriction hypothesis here, as the postulates used by us will
ensure that all effects are physically allowed. But in the general case, if the
no-restriction hypothesis is neither a postulate nor a consequence of the postulates one
chooses, then one has to introduce an extra set which specifies the allowed
effects/measurements.

However, we will assume that all well-defined measurements formed by allowed
effects are also allowed measurements.

The most prominent example of an impossible measurement is to measure position
and momentum of a particle in non-relativistic quantum mechanics. As the position
eigenstates form a basis, the measurement of the position is already normalized to
one. Adding effects of the form “Is the particle’s momentum found in $[p_1, p_2]$?”
would lead to a mathematically ill-defined measurement, because the total probability
would be larger than 1. Thus this important example, which is constructed from
allowed effects, is already mathematically forbidden because of wrong normalization.

Assumption. Let $e_1, \dots, e_n$ be physically allowed effects with $u_A \geq \sum_{j=1}^n e_j$. Then
$\{e_1, \dots, e_n\}$ can appear in a common physically allowed measurement.

Furthermore, we assume that for any event described by an effect $e$, the
counter-event described by $u_A - e$ is also physically allowed:

Assumption. If $e$ is a physically allowed effect, then so is the effect $u_A - e$.

If two effects $e_1, e_2$ can appear in a common measurement, then $e_1 + e_2$ should also be
a physically valid effect. It can be obtained by assigning a new combined outcome
to $e_1$ and $e_2$ which no longer distinguishes whether $e_1$ or $e_2$ was triggered.

Assumption. If $e_1, e_2$ are physically allowed effects with $e_1 + e_2 \leq u_A$, then
$e_1 + e_2$ is also a physically valid effect.

Definition 3.5. A set of states $w_1, \dots, w_n \in \Omega_A$ is called perfectly
distinguishable, if there is a set of allowed effects $e_1, \dots, e_n$, which can appear in a
common measurement with $\sum_j e_j \leq u_A$ and for which

    $e_j(w_k) = \delta_{jk}$    (3.8)

Proposition 3.6. If $w_1, \dots, w_n$ are perfectly distinguishable, there also exists a
properly normalized allowed measurement $e'_1, \dots, e'_n$ with $\sum_j e'_j = u_A$ and
$e'_j(w_k) = \delta_{jk}$.


Proof. For $j < n$, set $e'_j := e_j$. These trivially fulfil $e'_j(w_k) = \delta_{jk}$. Furthermore, set
$e'_n := u_A - \sum_{j=1}^{n-1} e_j$. Thus we obtain a properly normalized measurement. $e'_n$ is an
effect, because $u_A$ and the $e_j$ are linear and $0 \leq u_A - \sum_{j=1}^{n-1} e_j \leq u_A$.

By our assumptions, $e'_n$ is a physically valid effect, and

    $e'_n(w_j) = 1 - \sum_{k=1}^{n-1} e_k(w_j) = 1 - \sum_{k=1}^{n-1} \delta_{jk} = \delta_{nj}$    (3.9)

The last equality holds because for $j < n$, $\sum_{k=1}^{n-1} \delta_{jk} = 1$, while for $j = n$ we find
$\sum_{k=1}^{n-1} \delta_{jk} = 0$. □

Physically, this measurement can be implemented by using the measurement which
includes $e_1, \dots, e_n$. If we do not obtain one of the outcomes $1, \dots, n-1$, we say we
obtained outcome $n$. Thus the event with outcome $n$ is the counter-event for the event
“outcome $1, \dots, n-2$ or $n-1$ obtained” and thus is indeed described by
$u_A - \sum_{j=1}^{n-1} e_j$.
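
The construction in the proof of Proposition 3.6 can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the classical
three-level system, the helper names `apply` and `complete_to_measurement`, and the
concrete effects are hypothetical choices that do not appear in the text.

```python
from fractions import Fraction

# Hypothetical classical 3-level system: states are probability vectors in R^3,
# effects are covectors, and the order unit u_A sums all components.
u_A = (Fraction(1), Fraction(1), Fraction(1))

def apply(effect, state):
    """Evaluate the linear functional `effect` on `state`."""
    return sum(e * w for e, w in zip(effect, state))

def complete_to_measurement(effects, unit):
    """Keep e_1, ..., e_{n-1} and replace e_n by e'_n = u_A - (e_1 + ... + e_{n-1})."""
    kept = [tuple(e) for e in effects[:-1]]
    if kept:
        partial_sum = [sum(column) for column in zip(*kept)]
    else:
        partial_sum = [0] * len(unit)
    last = tuple(u - s for u, s in zip(unit, partial_sum))
    return kept + [last]

# Two perfectly distinguishable pure states and effects with e_1 + e_2 <= u_A.
w1, w2 = (1, 0, 0), (0, 1, 0)
e1, e2 = (Fraction(1), 0, 0), (0, Fraction(1), 0)

measurement = complete_to_measurement([e1, e2], u_A)
```

In this sketch the completed effects still satisfy $e'_j(w_k) = \delta_{jk}$, and they now
sum to the order unit, so they form a properly normalized measurement.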

Example 3.7. We consider the affine hyperplane given by $\Omega_A$, i.e. $u_A^{-1}(1)$, and
choose an origin in this plane such that $u_A^{-1}(1)$ can be considered a vector space.
All vectors we will consider now will be vectors in $u_A^{-1}(1)$. Effects are convex-linear,
and therefore of the form $e = L(\cdot) + y$ with $y$ a constant, $L$ a linear map. Thus
there exists a vector $v$ with $e(w) = v^T \cdot w + y$. The sets with constant values
$e^{-1}(a) = \{w \mid v^T \cdot w + y = a\}$ define affine hyperplanes in $u_A^{-1}(1)$ with $v$ as
normal vector.


Vice versa, two parallel hyperplanes in $u_A^{-1}(1)$ can be used to define a unique linear
functional in $A^*$ (see also Figure 3.3):

In the vector space $u_A^{-1}(1)$, we think of two parallel hyperplanes: $\{w \mid v^T \cdot w = a'\}$
and $\{w \mid v^T \cdot w = b'\}$. Hereby, $v$ is the normal vector of the hyperplanes. We consider
the convex-linear map $e : u_A^{-1}(1) \to \mathbb{R}$:

    $e(w) = \frac{v^T \cdot w - a'}{b' - a'} \cdot b + \frac{v^T \cdot w - b'}{a' - b'} \cdot a$    (3.10)

One can directly check $e(w) = a$ if $v^T \cdot w = a'$ and $e(w) = b$ if $v^T \cdot w = b'$. Thus all
points on the hyperplane $\{w \mid v^T \cdot w = a'\}$ are assigned the value $a$, the points on
the other hyperplane get the value $b$.
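
Equation (3.10) can be written out as a small Python function. This is a sketch
only; the function name `two_hyperplane_map` and the concrete numbers (a plane with
normal vector (1, 0) and hyperplanes at 0 and 2) are assumptions made for
illustration.

```python
def two_hyperplane_map(v, a_prime, b_prime, a, b):
    """Return e with e(w) = a on {v.w = a'} and e(w) = b on {v.w = b'}, as in (3.10)."""
    if a_prime == b_prime:
        raise ValueError("the two hyperplanes must be distinct")

    def e(w):
        s = sum(vi * wi for vi, wi in zip(v, w))  # the inner product v^T w
        return ((s - a_prime) / (b_prime - a_prime) * b
                + (s - b_prime) / (a_prime - b_prime) * a)

    return e

# Hypothetical example: hyperplanes x = 0 and x = 2 in the plane, with values 0 and 1.
e = two_hyperplane_map(v=(1.0, 0.0), a_prime=0.0, b_prime=2.0, a=0.0, b=1.0)
```

With these values, `e` vanishes on the line x = 0, equals 1 on the line x = 2, and
interpolates linearly in between, which is the typical way to obtain an effect.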

By the appendix, we know that this convex-linear map can be linearly extended to
$A$. If the values $a$ and $b$ are chosen carefully, one can define effects in this way.
A typical construction is to choose the parallel hyperplanes such that $\Omega_A$ is found
between them, while one hyperplane defines the states with $e(w) = 0$ and the other
one the states with $e(w) = 1$, as shown in Figure 3.3.

Figure 3.3: This figure shows an abstract state space and two parallel hyperplanes. These
two planes can be used to define a convex-linear map on $u_A^{-1}(1)$ or a linear map
on $A$.


3.3 Transformations and operations 

Next we consider transformations: 

A transformation converts one state of a physical system into another state (possibly
of another system): $T : A_+ \to B_+$. By the same argument as for effects,
transformations have to be convex-linear and can be extended to linear maps $T : A \to B$.
Furthermore, transformations should not increase the normalization, as otherwise a
physically meaningful normalized state could be changed into an unphysical
supernormalized state. Furthermore, a transformation should convert states into states.

Definition 3.8. Let $A$, $B$ be two abstract state spaces. A transformation is a
map $T : A \to B$ which satisfies:

1. $T(A_+) \subset B_+$. This property is called “T is positive”.

2. $u_B \circ T \leq u_A$, i.e. $u_B(T(w)) \leq u_A(w) \ \forall w \in A_+$.

3. $T$ is linear.


Comment. Just like with measurements and effects, not all well-defined
transformations have to be physically allowed. In particular, if the no-restriction
hypothesis is not satisfied, then for every physically allowed effect $e$ and every
physically allowed transformation $T$, $e \circ T$ also has to be a physically allowed effect.
This requirement is stronger than just $u_B \circ T \leq u_A$. Analogously, for a physically
allowed measurement $\{e_r \mid r \in R\}$, $\{e_r \circ T \mid r \in R\}$ also has to be a physically
allowed measurement. We will clarify the physical assumptions used in this thesis when
stating the postulates in the chapter on the postulates.

In Quantum Theory, one usually demands a stronger property than positivity:
complete positivity. This means that for all types of composite systems, the
transformation $T \otimes \mathbb{1}$ also has to be positive. This map means that the
transformation $T$ is applied on one part of the composite system, while the other part
is not changed at all. However, we do not consider composite systems here. In many
axiomatic derivations of quantum theory, one uses a postulate called Local Tomography.
It states that all states of composite systems can be characterised by local measurements
and correlations between them. In [?], Barrett shows how a tensor product rule for
composite systems can be derived from such an assumption. However, we will use
different postulates because we do not consider composite systems in this work. Note
that all completely positive transformations are also transformations in our sense,
thus the results we derive will also be valid for completely positive transformations.


Definition 3.9. A transformation $T : A \to B$ is called reversible, if $T^{-1}$ exists and
is a transformation too. A physically allowed transformation $T$ is called (physically)
reversible, if $T^{-1}$ exists and is a physical transformation. The set of physically
allowed physically reversible transformations is denoted by $G_A$.

Next we consider operations. 

We consider a collection $\{T_1, \dots, T_n\}$ of transformations. We have a device which
randomly applies exactly one of the transformations to any system which enters the
device. With our usual interpretation, $u_B \circ T_j(w)$ gives the probability that the j-th
transformation is applied to an incoming state $w$, i.e. with probability $u_B \circ T_j(w)$ the
system will be in the state $T_j(w) / (u_B \circ T_j(w))$. With probability $1 - u_B \circ T_j(w)$, this
procedure fails and one of the other transformations is applied. As the total probability
should be given by $u_A$ (i.e. 1 for a properly normalized state), the transformations should
satisfy $\sum_{j=1}^n u_B \circ T_j = u_A$. In case of a black-box device which does not tell us
which result it obtained for $j$, the state of the system after leaving the device will
be described by:

    $w' = \sum_{j:\, u_B \circ T_j(w) \neq 0} u_B \circ T_j(w) \cdot \frac{T_j(w)}{u_B \circ T_j(w)} = \sum_{j=1}^n T_j(w)$    (3.11)


Definition 3.10. An operation is a collection of transformations $O = \{T_1, \dots, T_n\}$
which satisfies

    $\sum_{j=1}^n u_B \circ T_j = u_A$    (3.12)

The most famous examples of operations are given by projective measurements in
quantum theory: Here, the transformations are given by projections. Thus it is
possible to model projective measurements by using operations.
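
As an illustration of Definition 3.10, the following Python sketch models a projective
qubit measurement as an operation. It assumes that states are 2x2 density matrices,
that the order unit is the trace, and that the transformations are $\rho \mapsto P_j \rho P_j$;
the helper names `matmul`, `trace` and `T` are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def trace(X):
    """The order unit of quantum theory in this sketch: u_A = Tr."""
    return X[0][0] + X[1][1]

# Projectors onto the two computational basis states of a qubit.
P0 = ((1, 0), (0, 0))
P1 = ((0, 0), (0, 1))

def T(P):
    """Transformation rho -> P rho P belonging to the projector P."""
    return lambda rho: matmul(matmul(P, rho), P)

operation = [T(P0), T(P1)]

half = Fraction(1, 2)
rho = ((half, half), (half, half))  # a pure state with equal weight on both outcomes

# u_A(T_j(rho)) is the probability that the j-th transformation is applied.
probabilities = [trace(Tj(rho)) for Tj in operation]

# Black-box output state as in (3.11): the sum of the unnormalized outputs T_j(rho).
outputs = [Tj(rho) for Tj in operation]
black_box_state = tuple(
    tuple(sum(out[i][j] for out in outputs) for j in range(2)) for i in range(2)
)
```

The probabilities sum to the trace of the input, which is condition (3.12) evaluated on
this state, and the black-box output has lost its off-diagonal entries.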


Comment. So far, our notion of an abstract state space only describes the structure
of the set of states. This definition can be extended to what is sometimes called a
dynamical abstract state space (see also [18], [?]): An abstract state space together
with a set of physically allowed measurements and operations
$(A, \Omega_A, u_A, \mathcal{M}_A, \mathcal{O}_A)$.
If one assumes that all mathematically well-defined measurements composed only of
physically allowed effects are also physically allowed (like we do in this thesis), then
it is sufficient to specify the set of allowed effects $\mathcal{E}_A$ instead of the set of allowed
measurements $\mathcal{M}_A$. If the no-restriction hypothesis is assumed to hold, then it is
not necessary to specify $\mathcal{M}_A$, $\mathcal{E}_A$.

Similarly, if one assumes that all mathematically well-defined operations constructed
from physically allowed transformations are also physically allowed, then it is
sufficient to specify the set of allowed transformations $\mathcal{T}_A$ instead of the set of
allowed operations $\mathcal{O}_A$. Furthermore, one of the main applications of
the GPT framework is to derive quantum theory. In many such derivations, one is
not interested in the set of allowed operations, but rather in the set $G_A$ of physically
reversible transformations. The reason is that such transformations map the state space
onto itself in a reversible way; such symmetry transformations put many restrictions
on the possible shapes of the state space and are therefore very useful in axiomatic
derivations of quantum theory. So an abstract state space is often considered as a
tuple $(A, \Omega_A, u_A, G_A)$, specifying also the set of physically reversible transformations.
When we state our postulates, we will also consider $(A, \Omega_A, u_A, G_A)$.

Example 3.11. In classical probability theory, measurements can in principle
be performed without disturbing the system. It is therefore possible, in principle, to
combine all measurements into one large measurement, from which all other
probabilities can be derived. The assumption of finite-dimensionality means that there
are only finitely many outcomes. For example, an n-sided die is fully characterised by
the probabilities for the n sides. Other probabilities, for example “Does the die show a
prime number?”, can be deduced from that. Thus, in classical probability theory, one
assumes that it is possible to find a single fiducial measurement which describes the
whole system. Thus the states can be described by listing the probabilities of all
the outcomes of that measurement:

    $w = (p_1, \dots, p_n)$    (3.13)
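
The die example can be made concrete with a short Python sketch. The two states
`fair` and `loaded` and the helper `apply` are hypothetical choices for illustration;
the effect for the prime-number question simply sums the probabilities of the faces
2, 3 and 5.

```python
from fractions import Fraction

n = 6
# States of a six-sided die: lists of the fiducial outcome probabilities (p_1, ..., p_n).
fair = [Fraction(1, n)] * n
loaded = [Fraction(1, 10)] * 5 + [Fraction(1, 2)]

# Effect for "Does the die show a prime number?": 1 on faces 2, 3, 5 and 0 otherwise.
prime_effect = [1 if face in (2, 3, 5) else 0 for face in range(1, n + 1)]
# The order unit adds up all fiducial probabilities.
unit = [1] * n

def apply(effect, state):
    """Evaluate a linear functional on a state."""
    return sum(e * p for e, p in zip(effect, state))

p_prime_fair = apply(prime_effect, fair)
p_prime_loaded = apply(prime_effect, loaded)
```

Both probabilities are deduced from the single fiducial measurement, without
performing any additional measurement on the die.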

The pure states are states with a predetermined outcome:
$w^{(k)} = (p_j^{(k)})_{j=1}^n = (\delta_{jk})_{j=1}^n$.

In particular, all pure states are perfectly distinguishable. All other states can be
obtained by statistical mixtures:

    $\Omega_{\text{classical}} = \mathrm{conv}\{w^{(1)}, \dots, w^{(n)}\}$    (3.14)


State spaces with only finitely many extreme points are called polytopes.
Finite-dimensional classical state spaces are polytopes. Even more, every mixed state
has a unique decomposition into pure states, so classical state spaces are simplices.

Example 3.12. In Quantum Theory, the most general description of states is
given by density operators, which include both irreducible quantum randomness and
the classical randomness caused by ensembles or missing knowledge. Here for a
finite-dimensional complex Hilbert space $H$, we define $A = \{\text{hermitian operators on } H\}$,
$\Omega_A = \{\text{density operators on } H\} = \{\rho \in A \mid \rho \geq 0, \rho^\dagger = \rho, \mathrm{Tr}(\rho) = 1\}$. Note that $A$
is only a real vector space. For example, $\mathbb{1}$ is hermitian, but not $i \cdot \mathbb{1}$.

There are two different cases to which the density operator formalism can be applied.

The first application is from statistical physics/thermodynamics:

Here, density operators describe ensembles of quantum systems whose microstates
all realize the same macrostate. If one applies a measurement of the observable $A$
to the ensemble, an average value $\langle A \rangle = \mathrm{Tr}(A\rho)$ is obtained.

The second application is mainly used in quantum information theory:

As explained before for more general GPTs, density operators can be used to
describe quantum states whose preparation is not fully known. If we know that with
probability $p_j$ the state $|\phi_j\rangle$ was prepared, then $\rho = \sum_j p_j |\phi_j\rangle\langle\phi_j|$ is our best
description of the state of the system. While it does not make sense to consider
measurements beyond average values for ensembles, for single systems of unclear
preparation it makes sense to consider measurements where only one out of several
outcomes is obtained. The most general such measurements are described by
POVMs (positive operator-valued measures):

Let $E_1, \dots, E_n \geq 0$, $\sum_j E_j = \mathbb{1}$, $E_j^\dagger = E_j$. Then $e_j(\cdot) := \mathrm{Tr}(E_j\, \cdot)$ form a measurement
which is called a POVM. The correspondence between $e_j$ and $E_j$ is induced by the
self-duality of quantum theory.
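
A minimal Python sketch of a POVM on a qubit follows. The three operators and the
state are hypothetical examples chosen so that all arithmetic stays exact; the helper
`trace_of_product` is likewise an assumption of this sketch.

```python
from fractions import Fraction

def trace_of_product(X, Y):
    """Tr(X Y) for 2x2 matrices given as nested tuples."""
    return sum(X[i][k] * Y[k][i] for i in range(2) for k in range(2))

half = Fraction(1, 2)
# Hypothetical three-outcome POVM: positive operators with E_1 + E_2 + E_3 = identity.
E1 = ((half, 0), (0, 0))
E2 = ((0, 0), (0, half))
E3 = ((half, 0), (0, half))
povm = [E1, E2, E3]

# A mixed qubit state, diagonal in the same basis.
rho = ((Fraction(3, 4), 0), (0, Fraction(1, 4)))

# The effects e_j(rho) = Tr(E_j rho) give the outcome probabilities.
probabilities = [trace_of_product(E, rho) for E in povm]

# Normalization check: the sum of the E_j is the identity operator.
povm_sum = tuple(tuple(sum(E[i][j] for E in povm) for j in range(2)) for i in range(2))
```

Note that this POVM has three outcomes on a two-dimensional system, so it is not a
projective measurement, yet the probabilities still sum to one.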

Definition 3.13. By $A^*_+$ we denote the set of unnormalized effects, i.e.:

    $A^*_+ = \mathbb{R}_{\geq 0} \cdot E_A = \{e \in A^* \mid e(w) \geq 0 \ \forall w \in A_+\}$    (3.15)

$A^*_+$ is also called the dual cone.

We say a state space is (strongly) self-dual, if there exists an inner product $\langle \cdot, \cdot \rangle$
such that:

    $A^*_+ = \{\langle \cdot, w \rangle \mid w \in A_+\}$    (3.16)

Thus $A_+$ and $A^*_+$ can be identified with each other in case of self-duality.

3.4 Equivalent state spaces 

So far we have motivated abstract state spaces by lists of fiducial probabilities.
However, as quantum theory and the Bloch ball suggest, sometimes other choices for
the state space are more convenient. Accordingly, state spaces are physically
equivalent if they have the same convexity structure [33]:

Definition 3.14. Two state spaces $(A, \Omega_A, u_A)$ and $(B, \Omega_B, u_B)$ are equivalent if
there exists a bijective linear map $L : A \to B$ such that $L(A_+) = B_+$ and $u_B \circ L = u_A$.

Comment. If the considered state spaces also include a set of allowed
effects/measurements/operations/(reversible) transformations, then the map $L$ also has to
conserve these sets, e.g. $\mathcal{E}_B \circ L = \mathcal{E}_A$ for the sets of allowed effects.

Our definition of abstract state spaces does not start from a list of probabilities,
but rather from any convex compact set. So it is important to note that every state
space is equivalent to a state space that has the form of a list of probabilities:

Theorem 3.15. Let $\Omega_A$ be a GPT with $\dim(A) = N$. Then $\Omega_A$ is equivalent to
a state space $\Omega_B$ such that all components can be found between 0 and 1, i.e. can
represent probabilities.

Proof. Here, we provide only a proof sketch.

$u_A^{-1}(1)$ describes a hyperplane in $A$ which contains $\Omega_A$. We rotate this hyperplane
such that it is perpendicular to the $x_N$-axis. As $\Omega_A$ is bounded, there exists a
$c > 0$ such that $\Omega_A \subset [-c, c]^{N-1} \times \{1\}$. The $N - 1$ vectors $(0, \dots, 0, c, 0, \dots, 0, 0)$
together with the vector $(-c, -c, \dots, -c, 1)$ form a basis. We define a linear map by
$(0, \dots, 0, c, 0, \dots, 0, 0) \mapsto (0, \dots, 0, \frac{1}{2}, 0, \dots, 0, 0)$ and $(-c, -c, \dots, -c, 1) \mapsto (0, \dots, 0, 1)$.

These new vectors also form a basis, thus the map is invertible. In particular,
$[-c, c]^{N-1} \times \{1\} \to [0, 1]^{N-1} \times \{1\}$. Thus the first $N - 1$ components of $\Omega_A$ are
now found in $[0, 1]$, while the last one is 1 and gives the normalization. □
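
The linear map of the proof sketch can be written down explicitly. The following
Python sketch assumes the hyperplane has already been rotated so that the last
component carries the normalization; the function name `to_probability_form` and the
sample values are hypothetical.

```python
from fractions import Fraction

def to_probability_form(w, c):
    """Apply the linear map of the proof sketch to w = (x_1, ..., x_{N-1}, t)."""
    *x, t = w
    # Decompose w = t * (-c, ..., -c, 1) + sum_i (x_i + c t) * (i-th unit vector).
    # By linearity the image has components (x_i + c t) / (2 c) and last component t.
    return tuple(Fraction(xi + c * t) / (2 * c) for xi in x) + (Fraction(t),)

c = 2
# Images of the basis vectors used in the proof sketch (here N = 3).
image_of_axis_vector = to_probability_form((c, 0, 0), c)
image_of_corner_vector = to_probability_form((-c, -c, 1), c)

# A normalized state inside [-c, c]^2 x {1} is mapped into [0, 1]^2 x {1}.
image_of_state = to_probability_form((-2, 1, 1), c)
```

For normalized states (t = 1) the map reduces to the affine rescaling
$x_i \mapsto (x_i + c)/(2c)$ of each of the first $N - 1$ components.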

3.5 Some mathematical properties of GPTs 

Lemma 3.16. For an abstract state space, let $F \subset A_+$ be a face. Then for all
$w \in F$, we have $\mathbb{R}_{\geq 0} \cdot w \subset F$.

Proof. Let $\lambda \in \mathbb{R}_{\geq 0}$ be arbitrary. If $\lambda = 1$, then $\lambda w \in F$ trivially.

If $\lambda > 1$, then $w = \frac{1}{\lambda} \lambda w + (1 - \frac{1}{\lambda}) 0$. As $F$ is a face, this implies $0 \in F$ and $\lambda w \in F$.

If $\lambda < 1$, then $\lambda w = \lambda \cdot w + (1 - \lambda) \cdot 0$. As faces are convex, also $\lambda w \in F$. □

There is a bijective correspondence between the faces of $\Omega_A$ and $A_+$:

Proposition 3.17. For an abstract state space, a face $F$ of $\Omega_A$ induces a face
$\mathbb{R}_{\geq 0} \cdot F$ of $A_+$. Vice versa, a face $F \neq \{0\}$ of $A_+$ induces a face $\Omega_A \cap F$ of $\Omega_A$. If
$w_1, \dots, w_m \in \Omega_A$ generate the face $F$ of $\Omega_A$ ($A_+$), they also generate the corresponding
face of $A_+$ ($\Omega_A$). Furthermore for a face $F \subset \Omega_A$, $\Omega_A \cap (\mathbb{R}_{\geq 0} \cdot F) = F$, and vice
versa for a face $G \subset A_+$, $G \neq \{0\}$, we find $G = \mathbb{R}_{\geq 0} \cdot (\Omega_A \cap G)$.

Proof. The proof is quite technical and is provided in Appendix B. □

The following lemma is based on Pfister’s Proposition 3.36 [13]:

Lemma 3.18. Let $e : \Omega_A \to [0,1]$ be an effect such that there exists a state $w$ with
$e(w) = 1$ (or $e(w) = 0$).

Then $e^{-1}(1)$ (or $e^{-1}(0)$) is a face of $\Omega_A$.

Proof. Let $e(w) = 1$ (or $e(w) = 0$) and $w = \sum_{j=1}^n p_j w_j$ be any convex combination
of states with $p_j > 0$. Then, by convex linearity, $\sum_{j=1}^n p_j e(w_j) = e(w) = 1$ (or 0). As
$p_j > 0$ and $\sum_{j=1}^n p_j = 1$ and $e(w_j) \in [0,1]$, this requires $e(w_j) = 1$ (resp. $e(w_j) = 0$).
This especially holds true for $n = 2$.

Furthermore, $e(w) = e(w') = 1$ (or 0) implies $e(pw + (1 - p)w') = p\,e(w) + (1 - p)\,e(w')
= p + 1 - p = 1$ (or $p \cdot 0 + (1 - p) \cdot 0 = 0$). Thus, $e^{-1}(1)$ (or $e^{-1}(0)$) is a face
of $\Omega_A$ (they are non-empty by requirement).

□ 


3.6 The gbit 

Now we consider our first GPT example beyond quantum and classical theory. The
gbit (or square-bit) is a square-shaped set of normalized states, see Figure 3.4. It is
very important, because one can interpret it as one half of a so-called PR-box with
superstrong correlations (see e.g. [2], [12] for more explanations). Also, the gbit is
one of the simplest state spaces and is often used as a counter-example. Here
we will use it to apply all the basic notions important for GPTs in a concrete
example.



Figure 3.4: The cone of unnormalized states $A_+$ and the square-shaped set of normalized
states $\Omega_A$ for the gbit.


Definition 3.19. Let $A = \mathbb{R}^3$. We set

    $\Omega_A := \mathrm{conv}(w_1, w_2, w_3, w_4) = \{(a, b, 1)^T \mid 0 \leq a, b \leq 1\}$

where $w_1 = (1, 1, 1)^T$, $w_2 = (1, 0, 1)^T$, $w_3 = (0, 0, 1)^T$, $w_4 = (0, 1, 1)^T$.

The idea behind that choice is that every state is described by two experiments
$E_1, E_2$ with two outcomes $x, y$ each. Then a normalized state can be written as
$(p(x|E_1), p(x|E_2), 1)^T$, considering that $p(y|E_j) = 1 - p(x|E_j)$. The third component
gives the normalization, i.e. $u_A = \mathrm{pr}_3$, the projector on the third component. These
states form a square. The corners $w_j$ of this square are those with definite results for
the measurements and thus deserve their name pure states, because they are states of
maximal knowledge about what will happen in these measurements.

Now we wish to find those operations $\{T_1, T_2\}$, $T_j : A \to A$, which can be used
to distinguish two opposing sides. As every side is given by the convex
combinations of two pure states, it is enough to consider them:

Theorem 3.20. All operations $\{T_1, T_2\}$ which can be used to distinguish the sides
given by $w_1, w_4$ and $w_2, w_3$, i.e.

1. $u_A(T_1(w_4)) = u_A(T_1(w_1)) = 1$

2. $u_A(T_2(w_2)) = u_A(T_2(w_3)) = 1$

have the following form:

    $T_2(a_1 w_1 + a_3 w_3 + a_4 w_4) = a_3 T_2(w_3) =: a_3 v'$

    $T_1(a_1 w_1 + a_2 w_2 + a_3 w_3) = a_1 T_1(w_1) =: a_1 v$

with $v, v' \in \Omega_A$ arbitrary.

However, among those, only the transformations with $v \in \{p \cdot w_1 + (1 - p) \cdot w_4 \mid p \in [0,1]\}$
and $v' \in \{p \cdot w_2 + (1 - p) \cdot w_3 \mid p \in [0,1]\}$ are repeatable, i.e.
$(u_A \circ T_j)\left(\frac{T_j(w)}{u_A(T_j(w))}\right) = 1 \ \forall w \in \Omega_A$ with $u_A(T_j(w)) \neq 0$, and thus
$(u_A \circ T_j)(T_j(w)) = u_A(T_j(w))$.

Proof. As $u_A(T_1(w)) + u_A(T_2(w)) = 1 \ \forall w \in \Omega_A$, we find $u_A(T_1(w_2)) = u_A(T_1(w_3)) = 0$
and $u_A(T_2(w_1)) = u_A(T_2(w_4)) = 0$. In particular, we find:

    $T_1(w_2) = 0, \quad T_1(w_3) = 0$    (3.17)

    $T_2(w_1) = 0, \quad T_2(w_4) = 0$

Here we used that $u_A$ as an order unit is strictly positive, i.e. 0 is the only element
of $A_+$ that is mapped to 0.


Now we want to construct the $T_j$. At first we notice that arbitrary choices of three of
the $w_k$ give us a basis, especially:

    $(1, 0, 0)^T = w_1 - w_4 = w_2 - w_3$,

    $(0, 1, 0)^T = w_1 - w_2 = w_4 - w_3$,

    $(0, 0, 1)^T = w_3$,

thus $\{w_1, w_2, w_3\}$ and $\{w_1, w_3, w_4\}$ each are a basis of $A$. Thus we already
know that the $T_j$ have a 2-dimensional kernel and rank 1. We can write:

    $T_2(a_1 w_1 + a_3 w_3 + a_4 w_4) = a_3 T_2(w_3) =: a_3 v'$

    $T_1(a_1 w_1 + a_2 w_2 + a_3 w_3) = a_1 T_1(w_1) =: a_1 v$

We rewrite $a_1 w_1 + a_2 w_2 + a_3 w_3 = a_1 w_1 + a_2 (w_1 - w_4 + w_3) + a_3 w_3
= (a_1 + a_2) w_1 + (a_2 + a_3) w_3 - a_2 w_4$ to find:

    $T_2(a_1 w_1 + a_2 w_2 + a_3 w_3) = (a_2 + a_3) v'$

    $T_1(a_1 w_1 + a_2 w_2 + a_3 w_3) = a_1 v$

    $(T_1 + T_2)(a_1 w_1 + a_2 w_2 + a_3 w_3) = a_1 v + (a_2 + a_3) v'$

As $1 = u_A((T_1 + T_2)(w)) \ \forall w \in \Omega_A$ and $(T_1 + T_2)(w_1) = v$ and $(T_1 + T_2)(w_2) = v'$,
we know that $v = (?, ?, 1)^T$, $v' = (?, ?, 1)^T$. As the $T_j$ are positive, we also know that
the missing components can be found in $[0,1]$. So we have $v, v' \in \Omega_A$.


Claim: All $v, v' \in \Omega_A$ give rise to valid operations.

1. $u_A(T_1(w_4)) = u_A(T_1(w_1)) = 1$, $u_A(T_2(w_2)) = u_A(T_2(w_3)) = 1$ by $v, v' \in \Omega_A$
and $w_4 = w_1 - w_2 + w_3$.

2. Now we show positivity and that the normalization does not increase:

As $A_+ = \mathbb{R}_{\geq 0} \cdot \Omega_A$, we either have $w = 0$ ($T_j(0) = 0$, so a state again), or
$w = (a, b, c)^T = c \cdot (a/c, b/c, 1)^T$ with $c > 0$ and $0 \leq a/c, b/c \leq 1$. Thus:

    $T_1(w) = T_1(a \cdot (w_2 - w_3) + b \cdot (w_1 - w_2) + c\, w_3) = b\, v$ and

    $T_2(w) = T_2(a \cdot (w_2 - w_3) + b \cdot (w_1 - w_2) + c\, w_3) = (a - a - b + c)\, v' = (c - b)\, v'$.

As $b \geq 0$ and $b \leq c$, both results are in $A_+ = \mathbb{R}_{\geq 0} \cdot \Omega_A$ again. Here one can
also see that the $T_j$ reduce the normalization, as before the normalization was
$c$ and now it is $b \leq c$ or $c - b \leq c$.

3. $u_A((T_1 + T_2)(a_1 w_1 + a_2 w_2 + a_3 w_3)) = u_A(a_1 v + (a_2 + a_3) v') = a_1 + a_2 + a_3 =
u_A(a_1 w_1 + a_2 w_2 + a_3 w_3)$, i.e. $u_A \circ (T_1 + T_2) = u_A$.

4. Linearity is clear.

Thus the claim is true.


At last we consider the consequences of repeatability:

We want those $T_j$ for which

    $(u_A \circ T_j)\left(\frac{T_j(w)}{(u_A \circ T_j)(w)}\right) = 1 \quad \forall w \in \Omega_A$    (3.18)

i.e. $(u_A \circ T_j)(T_j(w)) = u_A(T_j(w))$. As $u_A \circ (T_1 + T_2) = u_A$ and $u_A$ is strictly positive,
this is equivalent to $T_1 \circ T_2 = T_2 \circ T_1 = 0$ on $\Omega_A$ (and thus on $A$):

Let $k \in \{1, 2\}$, $k \neq j$. Then

    $(u_A \circ T_j)(T_j(w)) = u_A(T_j(w)) = (u_A \circ T_j)(T_j(w)) + (u_A \circ T_k)(T_j(w))$    (3.19)

and thus $(u_A \circ T_k)(T_j(w)) = 0$, i.e. $T_k \circ T_j = 0$. Vice versa:

    $u_A(T_j(w)) = (u_A \circ T_j)(T_j(w)) + (u_A \circ T_k)(T_j(w)) = (u_A \circ T_j)(T_j(w))$    (3.20)


The interpretation is clear: If we have measured result 1, a new measurement will
not lead to result 2 because of the repeatability.

Thus we have to choose $v, v'$ such that $T_2(v) = T_1(v') = 0$.

As $\ker(T_1) = \mathrm{span}\{w_2, w_3\}$ and $v' \in \Omega_A$, any choice $v' \in \{p \cdot w_2 + (1 - p) \cdot w_3 \mid p \in [0,1]\}$
is valid: We surely have $v' = a_2 w_2 + a_3 w_3$. As $u_A(v') = 1$, we find $a_2 + a_3 = 1$.
As $w_2 = (1, 0, 1)^T$ and $v' \in \Omega_A$, we need $0 \leq a_2 \leq 1$ because of the first
component. But then we also have $0 \leq a_3 \leq 1$.

As $\ker(T_2) = \mathrm{span}\{w_1, w_2 - w_3\} = \mathrm{span}\{w_1, w_4\}$ and $v \in \Omega_A$, any choice
$v \in \{p \cdot w_1 + (1 - p) \cdot w_4 \mid p \in [0,1]\}$ is valid: We have $v = a_1 w_1 + a_4 w_4$. As $v \in \Omega_A$,
also $a_1 + a_4 = 1$. As $w_1 = (1, 1, 1)^T$, we need $0 \leq a_1 \leq 1$ because of the first
component. But then also $0 \leq a_4 \leq 1$.


□ 


This shows that the measurement outcome can be used to distinguish the sides
spanned by $w_1, w_4$ and $w_2, w_3$. By relabelling, it should be possible to construct
operations that also separate the other two sides. However, it is not possible to
separate all the corners of the square (see [E]). The reason is, as we have seen,
that $T_j$ will disturb its input.
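
The operations of Theorem 3.20 are easy to check numerically. The sketch below uses
the coordinates $w = (a, b, c)^T$ from the proof, in which $T_1(w) = b\,v$ and
$T_2(w) = (c - b)\,v'$. The helper names `make_operation` and `is_repeatable` and the
concrete choices of $v, v'$ are hypothetical.

```python
from fractions import Fraction

# Corners of the gbit in the coordinates (a, b, c); c is the normalization.
w1, w2, w3, w4 = (1, 1, 1), (1, 0, 1), (0, 0, 1), (0, 1, 1)

def u_A(w):
    """Order unit of the gbit: projection onto the third component."""
    return w[2]

def make_operation(v, v_prime):
    """Build T_1(w) = b * v and T_2(w) = (c - b) * v' for w = (a, b, c)."""
    def T1(w):
        return tuple(w[1] * x for x in v)
    def T2(w):
        return tuple((w[2] - w[1]) * x for x in v_prime)
    return T1, T2

def is_repeatable(T1, T2, states):
    """Check T_2(T_1(w)) = 0 and T_1(T_2(w)) = 0 on the given states."""
    zero = (0, 0, 0)
    return all(T2(T1(w)) == zero and T1(T2(w)) == zero for w in states)

half = Fraction(1, 2)
corners = [w1, w2, w3, w4]

# v on the side spanned by w1, w4 and v' on the side spanned by w2, w3.
T1, T2 = make_operation(v=(half, 1, 1), v_prime=(half, 0, 1))
# A valid but non-repeatable choice: v is the centre of the square.
S1, S2 = make_operation(v=(half, half, 1), v_prime=(half, 0, 1))
```

Both pairs are operations that distinguish the two sides, but only the first pair is
repeatable, in line with the restriction on $v$ and $v'$ stated in the theorem. Checking
the corners suffices because the maps are linear.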


At last, we give examples of equivalent definitions of the gbit:


1. $L(w_1) = w_1$, $L(w_4) = (-1, 1, 1)^T$, $L(w_3) = (-1, -1, 1)^T$ can be linearly extended.
Then $L(w_2) = L(w_1 - w_4 + w_3) = (1, -1, 1)^T$. $L$ is invertible. It identifies the gbit
with $[-1, 1]^2 \times \{1\}$.


2. [?] uses a notation $(p(x|E_1), p(y|E_1), p(x|E_2), p(y|E_2))^T$ which also contains the
probabilities for the other results, but the normalization is not explicitly listed any
more. It is given by $p(x|E_1) + p(y|E_1) = p(x|E_2) + p(y|E_2) =: c$. We can achieve
this form by

    $L\, (a, b, c)^T = (a, c - a, b, c - b)^T$, i.e.
    $L = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}$.

$L$ is a bijective map from $A$ to
$\mathrm{span}\{(1, -1, 0, 0)^T, (0, 0, 1, -1)^T, (0, 1, 0, 1)^T\}$.

4 The postulates

Having introduced the framework, we can finally state the postulates. These
postulates are taken from [?], where it was shown that in finite dimension they
single out quantum theory, i.e. quantum theory is the only theory which satisfies
these postulates.

[?] also makes some additional weak assumptions on the set of allowed effects,
i.e. that it is convex and closed (for similar reasons as $\Omega_A$ is convex and closed)
and that it has full dimension (to ensure that there are no different states that give
the same probabilities for all measurements). We will not explicitly list this set, as
the postulates will have the consequence that all effects are allowed.

4.1 Motivation of the postulates 

Consider $n$ perfectly distinguishable pure states $w_1, \dots, w_n$. The convex hull of these
states has all the properties of a classical n-level system. The first postulate is that
every state is an element of a classical subspace:

Postulate 1 [Classical decomposability / weak spectrality]

For every state $w \in \Omega_A$ there exists a probability distribution $p_1, \dots, p_n$ and perfectly
distinguishable pure states $w_1, \dots, w_n$ such that:

    $w = \sum_{j=1}^n p_j w_j$    (4.1)

This means that non-classical behaviour only arises because not all states have
to belong to the same classical subspace.
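
For a qubit, Postulate 1 can be illustrated with the Bloch ball. The Python sketch
below assumes the standard parametrization of qubit states by Bloch vectors $r$ with
$|r| \leq 1$, where pure states have $|r| = 1$ and antipodal pure states are perfectly
distinguishable. The function names are hypothetical, and the sketch is not taken
from the text.

```python
import math

def classical_decomposition(r):
    """Decompose the qubit state with Bloch vector r into two antipodal pure states."""
    norm = math.sqrt(sum(x * x for x in r))
    if norm == 0:
        n = (0.0, 0.0, 1.0)  # maximally mixed state: any axis works
    else:
        n = tuple(x / norm for x in r)
    minus_n = tuple(-x for x in n)
    # Weights (1 + |r|) / 2 and (1 - |r|) / 2 form a probability distribution.
    return [((1 + norm) / 2, n), ((1 - norm) / 2, minus_n)]

def distinguishing_effect(n):
    """Effect e(r) = (1 + n.r) / 2: gives 1 on the pure state n and 0 on -n."""
    return lambda r: (1 + sum(a * b for a, b in zip(n, r))) / 2

r = (0.3, 0.0, 0.4)
(p_plus, n_plus), (p_minus, n_minus) = classical_decomposition(r)

# Recombine the mixture p_plus * n_plus + p_minus * n_minus to recover r.
recombined = tuple(p_plus * a + p_minus * b for a, b in zip(n_plus, n_minus))
e_plus = distinguishing_effect(n_plus)
```

Different mixed states generally lead to different axes $n$, which reflects the remark
that not all states belong to the same classical subspace.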

Next we consider a postulate which is important for powerful computation.

The computational power of a classical computer does not depend on its physical
implementation. Whether one uses silicon wafers, Lego, or redstone in Minecraft [44],
there are many ways to build Turing machines. An important requirement is that,
in terms of computational power, all classical implementations of a bit are equivalent.
This property is called bit symmetry. It can be extended to all classical
n-level systems, where n = 2 is the bit. Quantum computation and quantum information
are usually analyzed under the assumption that all quantum n-level systems
are equivalent, e.g. all qubits are equivalent. In particular, this assumption implies
that, in principle, it should be possible to translate an entangled state of a composite
system into a state of a single system and vice versa without any losses. Thus
this assumption is crucial for the superior computational power of quantum computers.
In more mathematical terms, the equivalence between n-level systems requires
a translation function T which translates one n-level system into another.
Furthermore, it should be possible to translate back. Thus PI considers (dynamical)
state spaces $(A, u_A, G_A)$, with $G_A$ the Lie group of reversible, physically
allowed transformations, and postulates:

Postulate 2 [Strong symmetry / generalised bit symmetry] 

For any $n \in \mathbb{N}$, let $w_1, \dots, w_n$ and $v_1, \dots, v_n$ be sets of perfectly distinguishable pure
states. Then there exists a reversible transformation $T$ such that $T(w_j) = v_j \ \forall j$.

To simplify notation, we refer to $G_A$ as the set of reversible transformations. The
condition that the transformations and their inverses have to be physically allowed
is implicit.

The next postulate is related to the famous two-slit experiment, which was an
important step towards the discovery of quantum physics. As for sand falling
through two slits, in classical physics one would expect the probability of an
electron passing a two-slit arrangement to be just the sum of the probabilities of
passing each single slit, and the resulting intensity on a detector plane to be just the
sum of the single-slit intensities. The surprising experimental result was that matter
also shows interference. There are multiple paths an electron could take, and all these
paths contribute with a certain phase, which leads to interference patterns, as
summarised in the path integral.

So one fundamental insight of quantum physics is that one cannot just “add” the
intensities and probabilities of two single slits to describe two-slit experiments. The
reason is interference.

Despite the fundamental importance of the two-slit experiment for the discovery of
quantum physics, a long time passed until people started to think about three-slit
experiments or other multi-slit experiments. One can ask questions similar to those
once asked for the two-slit experiment: If we know the behaviour of the single-slit and
the two-slit experiments, can we also infer the behaviour of the three-slit experiment?
Or will there be non-trivial interference which only appears when at least
three slits are involved?

The surprising answer is that in quantum theory there is no non-trivial third- or
higher-order interference.

Postulate 3 [No third-order interference] 

There is no non-trivial third- or higher-order interference. 

To state this postulate in an exact way, one needs a lot of technical formalization 
and abstract definitions. As we will not directly use this postulate, we will not 
make the effort. The details can be found in mm, and the original framework 
of higher-order interference was first introduced by Sorkin [30]. 

The last postulate gives rise to Hamiltonian dynamics:


Postulate 4 [Observability of energy] 

There is non-trivial reversible continuous time evolution, and the generator of every
such evolution can be associated with an observable (energy), which is a conserved
quantity.

This postulate is special in the sense that most axiomatic derivations of quantum
theory do not define any time evolution and do not talk about time at all, not
even in the final results. As before, we will not use this postulate directly and
thus do not explain it in full detail.

4.2 First consequences of the postulates 

The most important result from m is, that the 4 postulates single out the state 
space structure of quantum theory together with the unitary transformations: 

Theorem 4.1. The 4 postulates imply that the state space is an N-level state space of
standard complex quantum theory for some $N \in \mathbb{N}$, and all conjugations $\rho \mapsto U \rho U^\dagger$
with $U \in SU(N)$ are contained in the group of reversible transformations.

Proof. See Theorem 31 from m- □ 

We will now focus on the consequences of the first two postulates, i.e. classical
decomposability and strong symmetry, which we will often call Postulates 1 and 2.
It is therefore useful to have some examples of non-trivial state spaces that satisfy
these postulates. In particular, it is important to know that there are state spaces
fulfilling the postulates which are neither quantum nor classical:

Theorem 4.2. The possible state spaces satisfying Postulates 1, 2 and 3 which have
a non-trivial connected component $G_0$ of their reversible transformation groups are
the following:

1. The d-dimensional ball state spaces $\Omega_d := \{(1, r)^T \mid r \in \mathbb{R}^d, \|r\| \le 1\}$ with $d \ge 2$,
and either $G_0 = SO(d)$, or $G_0 = SU(d/2)$ if $d = 4, 6, 8, \dots$, or $G_0 = U(d/2)$
if $d = 2, 4, 6, 8, \dots$, or $G_0 = Sp(d/4)$ if $d = 8, 12, 16, \dots$, or $G_0 = Sp(d/4) \times U(1)$
if $d = 8, 12, 16, \dots$, or $G_0 = Sp(d/4) \times SU(2)$ if $d = 4, 8, 12, \dots$, or $G_0 = G_2$ if
$d = 7$, or $G_0 = Spin(7)$ if $d = 8$, or $G_0 = Spin(9)$ if $d = 16$.

2. N-level real quantum theory with $N > 2$ and $G_0 = \{\rho \mapsto O \rho O^T \mid O \in SO(N)\}$

3. N-level complex quantum theory with $N > 2$ and $G_0 = \{\rho \mapsto U \rho U^\dagger \mid U \in SU(N)\}$

4. N-level quaternionic quantum theory with $N > 2$ and $G_0 = Sp(N)/\{-1, +1\}$

5. 3-level octonionic quantum theory with $G_0 = F_4$.

However, among those, only the complex quantum theory state spaces (including $\Omega_3$,
the qubit) satisfy Postulate 4, that is, observability of energy.

Proof. Lemma 33 from m. □

The previous theorem tells us what the state spaces satisfying the first three postulates
are, i.e. one more postulate than we are going to consider. It is not known whether
the third postulate is independent of the first two, i.e. whether there exist state spaces that
satisfy Postulates 1 and 2, but not 3.

The following definitions and results are also taken from the cited literature. The proofs
are too long and technical to repeat here. The definitions and results themselves are
also quite technical, but they will be needed in this thesis. Therefore, examples from quantum
theory are used to explain them. When referring to “the Hilbert space”, we mean
the Hilbert space of pure states. This comparison is possible because our postulates
provide some of the structure that the Hilbert space of pure states provides for the
set of all states (i.e. pure and mixed states) in quantum theory.

Bit symmetry, i.e. the special case of strong symmetry for 2-level systems, has the
important consequence that the GPT is self-dual:

Theorem 4.3. Postulates 1 and 2 imply that $A_+$ is self-dual. The inner product can
be chosen such that all of the following properties hold:

1. $\langle Tw, Tv \rangle = \langle w, v \rangle$ for all reversible transformations $T$

2. $0 \le \langle w, v \rangle \le 1$ for all $w, v \in \Omega_A$

3. $\langle w, w \rangle = 1$ for all pure $w \in \Omega_A$ and $\langle v, v \rangle < 1$ for all mixed $v \in \Omega_A$

4. $\langle w, v \rangle = 0$ for all $w, v \in \Omega_A$ which are perfectly distinguishable. This means
that all perfectly distinguishable states are orthogonal.

Proof. See Proposition 3 from m, Theorem 1 from m, Proposition 5.19 from 

ra- □ 

This inner product will be our most important tool. In quantum theory, the
self-dualizing inner product on the space $A$ of Hermitian operators is given by
$\langle M, B \rangle = \mathrm{Tr}(M^\dagger B) = \mathrm{Tr}(MB)$.
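
For the quantum case, the properties listed in Theorem 4.3 can be checked numerically. The following is a minimal sketch for a qubit, assuming the Bloch-vector parametrization of density matrices; the helper names `density` and `inner` and the chosen example states are illustrative and not taken from the text.

```python
def density(r):
    """2x2 density matrix (nested lists) of a qubit with Bloch vector r = (x, y, z)."""
    x, y, z = r
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def inner(a, b):
    """Inner product <a, b> = Tr(a b) for 2x2 Hermitian matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2)).real

up = density((0, 0, 1))       # pure state
down = density((0, 0, -1))    # pure state, perfectly distinguishable from `up`
plus = density((1, 0, 0))     # pure state, not distinguishable from `up`
mixed = density((0, 0, 0.5))  # mixed state (Bloch vector shorter than 1)

print(inner(up, up))        # pure state: inner product with itself is 1
print(inner(up, down))      # perfectly distinguishable states are orthogonal
print(inner(up, plus))      # lies between 0 and 1
print(inner(mixed, mixed))  # strictly smaller than 1 for a mixed state
```

The qubit is only the simplest example; the same trace formula applies to any N-level quantum system.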

In quantum theory, orthonormal pure states $|1\rangle\langle 1|, |2\rangle\langle 2|, \dots, |n\rangle\langle n|$ are perfectly
distinguishable. We call such a set an n-frame and generalize this definition as follows:

Definition 4.4. A set of n perfectly distinguishable pure states $w_1, \dots, w_n \in \Omega_A$ is
called an n-frame.

Postulate 2 implies that two n-frames $w_1, \dots, w_n$ and $v_1, \dots, v_n$ can be reversibly
transformed into each other, i.e. there is a reversible transformation $T$ with $T w_j = v_j$.
As the following two lemmas show, there is a maximal frame size, and all smaller
frames can be completed to a maximal frame.

Lemma 4.5. Let $\dim(A) = n$ and let $\{w_j\}_{j \in J}$ be a set of perfectly distinguishable (pure)
states. Then $|J| \le n$.

Proof. Let $\{e_j\}_{j \in J}$ be the effects that distinguish the states: $e_k(w_j) = \delta_{kj}$. The
$w_j$ are linearly independent, because if they were linearly dependent, we would have
$w_k = \sum_{j \ne k} a_j w_j$ for some $k$, $a_j \in \mathbb{R}$, and thus $1 = e_k(w_k) = \sum_{j \ne k} a_j e_k(w_j) = 0$,
a contradiction. Thus the number of perfectly distinguishable (pure) states is no larger
than the dimension of $A$. □

In quantum theory, all bases contain the same number of states and have therefore 
the same frame-size. This generalizes to our GPTs: 

Lemma 4.6. All maximal sets of perfectly distinguishable pure states have the same
size, that is:

If $\{w_1, \dots, w_m\}$ and $\{w'_1, \dots, w'_n\}$ are both sets of perfectly distinguishable pure states
and $m < n$, we can find pure states $w_{m+1}, \dots, w_n$ such that $\{w_1, \dots, w_n\}$ is perfectly
distinguishable.

Proof. Let $T$ be the reversible transformation (Postulate 2) taking $\{w'_1, \dots, w'_m\}$ to
$\{w_1, \dots, w_m\}$, i.e. $T(w'_j) = w_j \ \forall j \le m$. Let $e'_1, \dots, e'_n$ be the effects with $e'_j(w'_k) = \delta_{jk}$,
$\sum_j e'_j \le u_A$. As $T$ is reversible, $T^{-1}$ must be normalization-preserving ($T$ is not
allowed to be normalization-increasing), i.e. $u_A \circ T^{-1} = u_A$. For $e_j := e'_j \circ T^{-1}$ we
have $\sum_{j=1}^{n} e_j = \sum_{j=1}^{n} e'_j \circ T^{-1} \le u_A \circ T^{-1} = u_A$. The $e_j$ are linear as compositions
of linear functions, and $\mathrm{im}(e_j) \subseteq \mathrm{im}(e'_j) \subseteq [0,1]$ on $\Omega_A$, as $T^{-1}(\Omega_A) = \Omega_A$ by
reversibility and positivity. That is, the $e_j$ are also effects.

$e_j(T(w'_k)) = e'_j(w'_k) = \delta_{jk}$, i.e. the $e_j$ perfectly distinguish $\{T(w'_1), \dots, T(w'_n)\} =
\{w_1, \dots, w_m, T(w'_{m+1}), \dots, T(w'_n)\}$.

Moreover, the $T(w'_j)$ are also pure: Let $p \in (0,1)$ and $T(w'_j) = p w + (1 - p) w'$. Then by
linearity and bijectivity $w'_j = p T^{-1}(w) + (1 - p) T^{-1}(w')$. As $w'_j$ is pure, $w'_j = T^{-1}(w) =
T^{-1}(w')$. Thus $w = w' = T(w'_j)$. □

Similarly to the bases of sub-Hilbert spaces in quantum theory, also faces can be 
identified by sets of perfectly distinguishable pure states. The rank generalizes the 
dimension. 

Proposition 4.7. Postulates 1 and 2 imply that every face of $\Omega_A$ is generated by a
frame. Any two frames that generate the same face $F$ have the same size, called the
rank of $F$, and denoted by $|F|$. Moreover, if $F \subseteq G$ and $F \neq G$, then $|F| < |G|$.
Every frame of size $|F|$ in $F$ generates $F$.

Proof. Proposition 2 from 0333 □ 

The no-restriction hypothesis is satisfied: 

Proposition 4.8. Postulate 1 and 2 imply that all effects are allowed. 

Proof. Proposition 1 from m- 


□ 





The following proposition is analogous to the fact that every orthonormal set (of
pure states) can be extended to an orthonormal basis of the Hilbert space.

Proposition 4.9. Postulates 1 and 2 imply that every frame $w_1, \dots, w_n$ can be
extended to a frame $w_1, \dots, w_{|A_+|}$ that generates $A_+$.


Proof. Proposition 5 from ra 


□ 


Orthogonal pure states are already perfectly distinguishable, i.e. for pure states,
orthogonality and perfect distinguishability are equivalent. Furthermore, maximal
frames define measurements that are similar to a POVM constructed from a
non-degenerate projective measurement in quantum theory. Removing some frame states
corresponds to leaving out some measurement results and thus gives a subnormalized
measurement:

Proposition 4.10. Postulates 1 and 2 imply that if $w_1, \dots, w_n$ are mutually orthogonal
pure states, then they are a frame and $\sum_{j=1}^{n} \langle w_j, \cdot \rangle \le u_A$. Furthermore, every
maximal frame $w_1, \dots, w_{|A_+|}$ adds up to the order unit, i.e. $\sum_{j=1}^{|A_+|} \langle w_j, \cdot \rangle = u_A$.

Proof. Proposition 6 from D33 □ 

Also for faces it is possible to extend frames to generating frames: 

Proposition 4.11. Suppose Postulates 1 and 2 are satisfied. If $w_1, \dots, w_n$ is any
frame contained in some face $F$ of $A_+$, then it can be extended to a frame $w_1, \dots, w_{|F|}$
of $F$ which generates $F$.

Proof. Proposition 7 from HD]. □ 

Just like the projectors in quantum theory, also the projectors considered here map 
states to states: 

Proposition 4.12. Postulates 1 and 2 imply that for every face $F$ of $A_+$, the
orthogonal projection $P_F$ onto the linear span of $F$ is positive.

Proof. Theorem 8 from HD]. □ 

Now consider a projective measurement in quantum theory with orthogonal projectors
$P_1, \dots, P_m$. They map states according to $\rho \mapsto P_j \rho P_j$. Then $\mathrm{Tr}(P_j \rho P_j) = \mathrm{Tr}(\rho P_j)$
is the probability that after the measurement, the state is found in the subspace given
by $P_j$. A generalization of the functional $\mathrm{Tr}(P_j \, \cdot)$ is defined as follows:

Definition 4.13. Let $A$ be any system satisfying Postulates 1 and 2. Then, to every
face $F$ of $A_+$, define the projective unit $u_F$ as

$$u_F := u_A \circ P_F \qquad (4.2)$$

where $P_F$ is the orthogonal projection onto the linear span of $F$.

By using self-duality and the symmetry of $P_F$, we find $\langle u_F, w \rangle = u_F(w) = u_A \circ P_F(w) =
u_A(P_F w) = \langle u_A, P_F w \rangle = \langle P_F u_A, w \rangle$. Thus one can also write $u_F = P_F u_A$, which is
the original definition from ra- 

In quantum theory, the $\mathrm{Tr}(P_j \, \cdot)$ are well-defined probability functionals. Furthermore,
for a spectral decomposition $P_j = \sum_k |k_j\rangle\langle k_j|$ we find $\mathrm{Tr}(P_j \, \cdot) = \sum_k \mathrm{Tr}(|k_j\rangle\langle k_j| \, \cdot)$, i.e.
$\mathrm{Tr}(P_j \, \cdot) = \sum_k |k_j\rangle\langle k_j|$ using self-duality. Here, the $|k_j\rangle$ span the subspace $P_j$ projects
onto. Furthermore, two projectors can only appear in a common measurement if
they are orthogonal. Similar results also hold for our generalizations:

Proposition 4.14. Let $A$ be any system satisfying Postulates 1 and 2. Then $u_F$ is an
effect, $0 \le u_F \le u_A$, with $u_F(w) = 1 \ \forall w \in F$. If $w_1, \dots, w_{|F|}$ is any frame that
generates $F$, then

$$u_F = \sum_{j=1}^{|F|} w_j \qquad (4.3)$$

We have $u_F + u_G \le u_A$ if and only if $F$ and $G$ are orthogonal.

Proof. Lemma 11 from PI- □ 
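
Equation (4.3) and the statements of Proposition 4.10 can be illustrated in the quantum case. The following minimal sketch uses a real qutrit, where a frame state is $|v\rangle\langle v|$ for a unit vector $v$ and $\langle w_j, \rho \rangle = \langle v_j | \rho | v_j \rangle$; the function names and the example state `rho` are hypothetical choices made for this illustration.

```python
import math

def expectation(v, rho):
    """<v| rho |v> for a real unit vector v and a real symmetric matrix rho."""
    n = len(v)
    return sum(v[i] * rho[i][j] * v[j] for i in range(n) for j in range(n))

def projective_unit(frame, rho):
    """u_F(rho) = sum_j <w_j, rho> for a frame w_j = |v_j><v_j| generating F."""
    return sum(expectation(v, rho) for v in frame)

s = 1 / math.sqrt(2)
frame_a = [(1, 0, 0), (0, 1, 0)]    # one frame generating the face F
frame_b = [(s, s, 0), (s, -s, 0)]   # a rotated frame generating the same face F
frame_g = [(0, 0, 1)]               # frame of the face G orthogonal to F

rho = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.0],
       [0.0, 0.0, 0.2]]             # a mixed qutrit state with trace 1

rho_in_f = [[0.7, 0.0, 0.0],
            [0.0, 0.3, 0.0],
            [0.0, 0.0, 0.0]]        # a state inside the face F

u_f_a = projective_unit(frame_a, rho)
u_f_b = projective_unit(frame_b, rho)
u_g = projective_unit(frame_g, rho)
```

Both frames give the same value of $u_F$, the values of $u_F$ and $u_G$ add up to the trace of `rho`, and $u_F$ evaluates to 1 on the state inside $F$.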

In quantum theory, a projective measurement $P_1, \dots, P_m$ only gives a predictable
outcome $k$ if the considered system is already found in the subspace onto which $P_k$
projects. Something similar holds for our generalization:

Proposition 4.15. Assume Postulates 1 and 2. Then every face $F$ of the set of
normalized states can be written as:

$$F = \{w \in \Omega_A \mid \langle u_F, w \rangle = 1\} \qquad (4.4)$$

Proof. Proposition 5.29 from [12]. □

5 Von Neumann’s thought experiment

5.1 The plan 

In quantum theory, consider a density operator $\rho$ with orthonormal eigenbasis $|j\rangle$
and eigenvalues $p_j$, i.e. $\rho = \sum_j p_j |j\rangle\langle j|$.
The von Neumann entropy is defined as

$$S(\rho) = -k_B \sum_j p_j \ln p_j \qquad (5.1)$$

where $0 \ln 0 := 0$ by continuity. In quantum theory, all pure states are of the form
$|j\rangle\langle j|$. As the eigenstates are orthogonal, they are perfectly distinguishable.

Now we consider GPTs which satisfy Postulates 1 and 2. For any state $w \in \Omega_A$,
we consider a classical decomposition $w = \sum_j p_j w_j$ with the $w_j$ pure and perfectly
distinguishable. A natural generalization of the von Neumann entropy is:

$$S(w) = -k_B \sum_j p_j \ln p_j \qquad (5.2)$$

Of course, the analogy to quantum theory is enough to motivate why this entropy
definition is natural and interesting. However, it is important for us that von Neumann
obtained his entropy from thermodynamic considerations [5]. Realizing an idea
by J. Barrett [7], we will see that these considerations can be applied to many other
GPTs as well. While it is relatively easy to introduce operationally/information- 
theoretically motivated entropies in GPTs (see e.g. m, ra), there is no straight¬ 
forward way to introduce thermodynamics to GPTs. Thus von Neumann’s thought 
experiment is an important step to provide a deep connection between information 
theory and thermodynamics also for other GPTs. 
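
Given the weights $p_j$ of a classical decomposition, the entropy (5.2) is straightforward to evaluate. The following is a minimal sketch; the function name `gpt_entropy` and the default $k_B = 1$ (natural units) are assumptions made for this illustration.

```python
import math

def gpt_entropy(probabilities, k_B=1.0):
    """S(w) = -k_B * sum_j p_j ln p_j, with the convention 0 ln 0 := 0."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 are skipped, which implements 0 ln 0 := 0.
    return -k_B * sum(p * math.log(p) for p in probabilities if p > 0)

pure = gpt_entropy([1.0, 0.0])              # pure state: entropy 0
maximally_mixed = gpt_entropy([0.5, 0.5])   # equal weights on a 2-frame: ln 2
```

For a quantum state, `probabilities` would be the eigenvalues of the density operator; for a general GPT satisfying Postulate 1, they are the weights of a classical decomposition.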

5.2 Combining GPTs and ideal gases 

We consider a GPT ensemble $[S_1, S_2, \dots, S_N]$, which we will call the $S_j$-ensemble.
Now we consider the following trick introduced by Einstein and applied by von Neumann:
Imagine we take $N$ small, hollow boxes $K_1, K_2, \dots, K_N$ and put one of the
systems into each of the boxes. We do this in such a way that the internal system
$S_j$ has no interaction with its box $K_j$ or anything else: the systems are completely
isolated, such that they are not perturbed and thus the ensemble is not changed.
The boxes are assumed to form an ideal classical gas. The internal GPT systems
have no impact on the behaviour of the classical gas, as the internal systems are
hidden from any interaction, except for two steps in which we will actually
open the box and measure its content or transform it.

So we have a classical ideal gas whose particles function as carriers of internal,
passive GPT systems. A key idea is that the inner state will behave like a
classical label. This may sound absurd, because something like that would be extremely
hard to realize in an experiment; but this is a thought experiment, and thermodynamics
should also be capable of describing such well-defined thought experiments.

The basic idea for the derivation of the thermodynamic entropy of a GPT ensemble 
is that of consistency: We will perform a reversible operation. We already know how 
the classical gas and its entropy will behave, and the difference between the total 
entropy change and the entropy change of the classical gas (or the heat reservoir) 
must be caused by the GPT ensemble. 

5.3 Relation between the entropies of the GPT ensemble and 
the gas 

We assume we have a $w$-ensemble, where $w$ is the state of $[S_1, \dots, S_N]$, and a
$v$-ensemble, where $v$ is the state of $[S'_1, \dots, S'_N]$. Later on, we need to be sure
that the entropy difference between the gases is the same as the entropy difference
between the internal GPT ensembles, if both gases are considered under the same
conditions (i.e. same temperature and same volume of the tank). The idea is that the
gases are almost equal, the only difference in entropy being caused by the internal
GPT entropy.

So if we consider low temperatures, the movement of the boxes freezes out and the 
entropy of the gas is just given by its internal GPT ensemble. In this limit, we can 
imagine the gas as just a bunch of GPT systems that do not see each other, which is 
how ensembles are typically introduced in textbooks. So in this limit, the statement 
is true. 

Now we heat the two gases to the same, arbitrary temperature $T$. The boxes are
assumed to be completely identical. As the internal GPT systems are completely
isolated, only the boxes can take up any work or heat, while the internal GPT systems
are unaffected. Thus both gases have the same specific heat $C_V = \frac{\delta Q}{\delta T}$. So in order
to heat them by $\delta T$, both need the same heat $\delta Q$. As $dS = \frac{\delta Q}{T}$, both gases have
the same change in entropy. Thus the entropy difference is still given by the GPT
ensembles.

Therefore, for two such gases with the same $T, V, N$:

$$S_{w\text{-gas}} - S_{v\text{-gas}} = S_{w\text{-ensemble}} - S_{v\text{-ensemble}} \qquad (5.3)$$

5.4 Tool 1: Semipermeable membranes 

For von Neumann’s reasoning, one needs semipermeable membranes. In quantum
theory, if we start with a density operator $w$, then we can diagonalize it. Let
$w = \sum_j p_j |j\rangle\langle j|$ with $|j\rangle$ orthonormal. We can realize $w$ by preparing many quantum
systems, $p_j$ being the probability that a system is prepared in the state $|j\rangle$.
This ensemble is now our $[S_1, \dots, S_N]$-ensemble. As the $|j\rangle$ are orthonormal, one can
imagine a semipermeable membrane: This membrane opens the boxes of incoming
particles and measures the internal quantum state. This measurement is a projective
measurement in the orthonormal basis $|j\rangle$. We know that this measurement does
not perturb the internal quantum state and always gives the right result. Depending
on the state, the box is allowed to pass (a window opens) or is reflected (the window
remains closed). Von Neumann also gives a thermodynamic argument that such
a semipermeable membrane can only exist for orthogonal states. As this is also a
standard result from quantum information theory, we will not reproduce it here.
Moreover, it is enough to know that there exists one preparation procedure for which the
single-system states can be distinguished by a semipermeable membrane.

The orthonormal states $|j\rangle\langle j|$ from above have the important property that we
can distinguish them without perturbation or error. That is, if we know that the system
is in one of the states $|j\rangle\langle j|$, we can find out with certainty which one,
and we can do so without destroying the state. This is reminiscent of classical
physics.¹ The property that a (mixed) state can be prepared by using only perfectly
distinguishable pure states is thus called classical decomposability. Reproducing the
von Neumann argument for more general GPTs is the reason why we are so
interested in this postulate.

Thus if a GPT fulfils Postulate 1, we can prepare arbitrary (mixed) states by using
only perfectly distinguishable pure states $w_j$, which replace the eigenbasis $|j\rangle$ from
the quantum case. Then we can consider a semipermeable membrane which uses the
effects $e_i$ with $e_i(w_j) = \delta_{ij}$ to find out the internal state of the box. A moment of
thought is needed regarding post-measurement states: We do not want the semipermeable
membrane to change the internal state of the boxes. However, Postulates 1
and 2 do not make any statement about post-measurement states. Thus we have to
add the additional assumption that a perfectly distinguishing measurement can be
implemented without disturbing the states it distinguishes. This assumption is well
motivated. First of all, one could assume that the membrane re-prepares the box in
the same state it has just measured, undoing any perturbation caused by the
measurement. In more detail, we might consider a measurement described by the
operations $T_1, \dots, T_n$ with $u_A \circ T_j(w_k) = \delta_{jk}$ and $T_j w_k = \delta_{jk} v_j$. We assume that the
measurement is perfect/noiseless in the sense that the pure $w_j$ are mapped to pure
states, i.e. the $v_j$ are pure. Then by Postulate 2, there is a reversible transformation $T'_j$ with
$T'_j v_j = w_j$. As $T'_j$ is reversible, $u_A \circ T'_j = u_A$. Thus the operation $T'_1 \circ T_1, \dots, T'_n \circ T_n$
induces the same measurement, but it does not disturb the $w_j$. Also, we will see
in Chapter 7 that projective measurements in the style of quantum theory can be
defined. It is reasonable that the operations are described by projective measurements,
as they should be repeatable. This is also the motivation for Pfister’s one
simple postulate [131]:


¹ The convex hull of such perfectly distinguishable pure states forms a simplex and thus is a
classical system in the sense of GPTs.

Postulate [Pfister’s one simple postulate]

If we can predict the outcome of a measurement with certainty, we can perform the
measurement without altering the state: Let $M = \{e_1, \dots, e_n\}$ be a pure measurement
with corresponding operation $\{T_1, \dots, T_n\}$. If $w$ is a state with certain outcome, i.e.
$e_k(w) = 1$ for some $k$, then $T_k(w) = w$.

We will implement this assumption by refining the postulate of classical decomposability.
The basic idea of classical decomposability is that every state is part of
a classical subspace and that the non-classical properties only exist because there
are different classical subspaces. Thus if we stay within a classical subspace, the
key idea is that this subspace behaves classically. In particular, a measurement which
perfectly distinguishes the pure states can be implemented such that it does not
disturb the pure states. Thus we postulate:

Postulate 1’ [Classical decomposability and classical behaviour of classical subspaces]

For any state $w \in \Omega_A$, there exist a probability distribution $p_1, \dots, p_n$ and perfectly
distinguishable pure states $w_1, \dots, w_n$ with $w = \sum_{j=1}^{n} p_j w_j$. A measurement which
perfectly distinguishes the pure states can be implemented by a physically allowed
operation $T_1, \dots, T_n$ which does not disturb the pure states: $T_j w_k = \delta_{jk} w_k$.

Furthermore we note that this refined postulate is only necessary for the thought 
experiment. For the mathematical definition of the entropy and the proof of its 
properties, Postulates 1 and 2 from ra will be sufficient. As already mentioned, we 
will construct projective measurements in Chapter 7. Thus the new postulate is not
stronger than the original postulate concerning the state space structure. However, 
it does tell us that the projective measurement or a measurement with similar non¬ 
destructive properties is physically allowed. Thus we know that even without the 
stronger postulate, there is at least one mathematically well-defined operation to 
perfectly distinguish a frame. The stronger version of this postulate thus just adds 
that this or a similar operation is physically allowed, but does not add anything to 
the state space structure. 

Long story short: 

Postulate 1, refined by the assumption of non-destructive distinguishing measure¬ 
ments, is enough to obtain a semipermeable membrane. 

5.5 Tool 2: Reversible pure state conversions 

In the end we will transform pure states reversibly into other pure states. In quantum
physics, this can be achieved by unitary time evolution $e^{iHt}$ or a complicated
sequence of infinitely many measurements with infinitesimal perturbation.

We will simply use Postulate 2 from uni. In fact, we only need a weaker form of
this postulate. For us it is enough to know that any pure state can be reversibly
transformed to any other one.

Figure 5.1: This figure visualizes all steps of von Neumann’s thought experiment.

Now we can finally perform von Neumann’s thought experiment. All steps of the
thought experiment are shown in Figure 5.1. It is important to recognize that, except
for Tools 1 and 2, the underlying GPT from which the internal ensemble is
taken plays no role. The reason is that, using the semipermeable membrane and the
reversible state conversion, the inner GPT state just behaves like a classical label.
This is an important consequence of classical decomposability, i.e. the idea that
every system belongs to a classical subspace.




Figure 5.2: Tank K contains our gas. In the beginning, the tank K', which is a clone of 
K, is empty. 

We assume we start with a $w$-ensemble $[S_1, \dots, S_N]$. Following Postulate 1’, we
assume that the systems $S_j$ are prepared using only perfectly distinguishable pure states $w_j$,
i.e. $w = \sum_j p_j w_j$. Choosing $N$ very large, we can assume that $p_j \cdot N$ systems have
the internal state $w_j$. Thus we have access to Tool 1, the semipermeable membrane.
This ensemble is implanted into a classical gas at temperature $T$ confined in tank
$K$ of volume $V$. To the left, we add a tank $K'$ of the same volume $V$, but empty
(vacuum), see Figure 5.2.


We assume that we have two neighbouring walls separating the two tanks. The wall
to the left is a standard wall, not letting anything through. We call it 1. The wall
to the right is semipermeable (Tool 1): The boxes with internal state $w_1$ can pass
through the semipermeable membrane, the other ones are reflected. We call this
wall 2. Furthermore, we have another semipermeable membrane (Tool 1) at the right
end of tank $K$: It is transparent for all $w_j$ with $j \neq 1$ and only reflects $w_1$. We call
this wall 3. The whole situation is shown in Figure 5.3.

Now we move the standard wall 1 and the right semipermeable membrane (i.e.
wall 3) to the left while keeping them at constant distance. We do so until wall 1 collides
with the left end of tank $K'$.

The boxes with $w_j$, $j \neq 1$, are not influenced by this procedure at all. As the walls
are moved at the same velocity, the $w_1$-gas is also kept at constant volume, and we do
not need to perform any work. The basic idea is that the $w_2, w_3, \dots$-gas exerts the
same pressure on wall 3 from both sides and thus can be neglected. The pressure
of the $w_1$-gas one has to work against at wall 3 (right) is the same pressure that
pushes wall 1 (left), thus here the energy difference is also zero (the work needed at
the right end can be regained at the left end).

This way of arguing is justified by Dalton’s law (see e.g. [27j or [S5J Chapter 3.5): 
Different types of ideal gases behave as if they were alone: For a gas which can access
a container of volume $V$, the partial pressure of that gas is given by $pV = N k_B T$,
where $N$ is the number of particles of that gas. The total pressure is given by the
sum of all the partial pressures of the different types of gases. This law is a consequence
of the fact that the Hamiltonian of the ideal gas is modelled to include no
particle-particle interaction.
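
Dalton's law can be made concrete with a few numbers. The following is a minimal sketch under assumed values: the particle number `N`, temperature `T`, volume `V` and the weights are hypothetical and only serve as an illustration.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K

def partial_pressure(n_particles, temperature, volume):
    """Partial pressure p = N k_B T / V of one ideal gas species."""
    return n_particles * K_B * temperature / volume

# Hypothetical setup: N boxes at temperature T in volume V, with weights p_j.
N, T, V = 1.0e22, 300.0, 1.0e-3
weights = [0.5, 0.3, 0.2]

# Each w_j-gas contributes as if it were alone in the container.
partials = [partial_pressure(p_j * N, T, V) for p_j in weights]
total = sum(partials)
```

The sum of the partial pressures equals the pressure of a single gas with all `N` particles, which is why each $w_j$-gas can be treated separately in the argument above.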


Figure 5.3: The $w_1$-boxes are separated from the rest by using three walls: The green
wall inscribed with label 2 is a semipermeable wall which lets only $w_1$-boxes
pass. The red wall with label 3 is a semipermeable wall which
lets all $w_j$ pass, except $w_1$ (green: go, red: no-go). Wall 3 and the
standard wall 1 are moved to the left at constant distance.

Now all $w_1$-boxes are in $K'$, while all the other boxes are in $K$. We separate the
two tanks.

Thus we have now (reversibly, without any work or heat exchange!) successfully
isolated the $w_1$-gas from the rest. We repeat this procedure until each
$w_j$-gas ends up in its own tank.

Now we isothermally compress each tank to the volume $p_j \cdot V$, as shown in Figure 5.4.

Figure 5.4: Each tank containing a $w_j$-gas is compressed to the size $p_j V$.

The work needed for this is:

$$\delta W = -\sum_j \int_{V}^{p_j V} p \, dV' = -\sum_j \int_{V}^{p_j V} \frac{p_j N k_B T}{V'} \, dV'$$

$$= -\sum_j p_j N k_B T \ln(V') \Big|_{V}^{p_j V} = -\sum_j p_j N k_B T \left[ \ln(p_j V) - \ln(V) \right]$$

$$= -N k_B T \sum_j p_j \ln(p_j)$$

The mean energy $E \propto T = \text{const}$ is constant in isothermal procedures. Thus the
work performed on the gas is given as heat to the heat reservoir. Following $dS = \frac{\delta Q}{T}$,
the entropy of the reservoir increases by $-N k_B \sum_j p_j \ln(p_j)$; conversely, the entropy
of the collection of gases changes by $N k_B \sum_j p_j \ln(p_j)$ (which is negative!), as the collection
of gases loses the heat $\delta Q$.
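
The closed form of the compression work can be cross-checked by integrating the ideal gas pressure numerically. The following is a minimal sketch using the midpoint rule; the function name, the weights and the units with $N k_B T = 1$ and $V = 1$ are assumptions made for this illustration.

```python
import math

def compression_work(p_j, n_total, k_b_t, volume, steps=20_000):
    """Work done on the w_j-gas when it is compressed isothermally from V to p_j * V.

    Midpoint-rule integration of -P dV' with P = p_j * N * k_B * T / V'.
    """
    v_start, v_end = volume, p_j * volume
    dv = (v_end - v_start) / steps  # negative, because the volume shrinks
    work = 0.0
    for i in range(steps):
        v_mid = v_start + (i + 0.5) * dv
        work -= p_j * n_total * k_b_t / v_mid * dv
    return work

weights = [0.5, 0.3, 0.2]  # hypothetical weights p_j
numeric = sum(compression_work(p, 1.0, 1.0, 1.0) for p in weights)
closed_form = -sum(p * math.log(p) for p in weights)  # -N k_B T sum_j p_j ln p_j
```

Both values agree up to the discretization error of the integration, and both are positive, since work has to be performed on the gases to compress them.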


Now we apply Tool 2: All the gases are reversibly transformed into the same pure
state $w'$. These gases all have the same density $\frac{p_j N}{p_j V} = \frac{N}{V}$. We define that an
ensemble whose systems are all in the same pure state has entropy 0. We can do this
because all pure states can be reversibly transformed into each other (Tool 2), so
there is no entropy change in this step. It makes sense to define the entropy of a pure
ensemble as 0: All particles carry the same label, so the label is trivial.


The last step is to merge all the tanks into one tank of volume $V$ and take
away the separating walls, see Figure 5.5. Of course, we could put the walls back in;
there is no entropy change here, as all the tanks contained the same gas at the same
density anyway.

Overall, we have reversibly transformed our original $w$-gas into a pure gas at the same
temperature and volume. The entropy change of the gas was $N k_B \sum_j p_j \ln(p_j)$. As
we have already reasoned, this entropy change is the entropy change of the internal
ensemble. As the internal ensemble now has entropy 0, our original $w$-ensemble
had the entropy

$$S_{\text{GPT}} = -N k_B \sum_j p_j \ln(p_j)$$

Figure 5.5: In the end, all tanks contain the same pure gas and are merged into one tank of
size $V$. The resulting gas differs from the original $w$-gas only in its internal
GPT state and in the fact that the entropy has changed by $N k_B \sum_j p_j \ln(p_j)$.

For the special case of quantum theory, the $p_j$ are the eigenvalues of the density operator
$w$ and we can also write $S_{\text{QT}} = -N k_B \mathrm{Tr}(w \ln w)$. Furthermore, we can consider the
entropy per particle

$$s_{\text{GPT}} = \frac{S_{\text{GPT}}}{N} = -k_B \sum_j p_j \ln(p_j)$$


5.7 Entropy from combinatorial considerations 

The derivation of the entropy was based on a purely thermodynamic argument, 
not using any combinatorial arguments as introduced by Boltzmann in statistical 
physics. We will now give a short combinatorial argument for an isolated system 
which gives us the same entropy equation as in the thermodynamic derivation. Ar¬ 
guments of that form are often used in statistical physics, e.g. a related argument 
can be found in |30 or [37]. 

Once more we consider a GPT-ensemble realized with perfectly distinguishable pure states, each system being put into a small box. There are $N$ systems in total, $N_j$ of them in the state $w_j$. Once more we assume that these boxes form a classical ideal gas. However, this time we assume that the container is perfectly isolated instead of being surrounded by a heat bath.

The basic idea is that the inner GPT-states behave like a classical label, i.e. instead of distinguishing between different GPT-states, we could have boxes in different colors or different molecules.


The number of states in this situation is given by

$$\Omega_{total} = \Omega_{gas} \cdot \Omega_{GPT\text{-}configuration} \tag{5.4}$$


There are two slightly different points of view with the same result, see also Figure 5.6.

Figure 5.6: This figure illustrates the two different points of view of the combinatorial argument. In the first one, we start by distributing empty (i.e. label-less) boxes across the phase space, and then put systems (i.e. labels) into the boxes. In the second one, we start with boxes containing systems/labels and distribute them in the phase space.


The first point of view is that we start with empty boxes. They are indistinguishable, and $\Omega_{gas}$ is the standard number of states for a monoatomic ideal gas without labels (see e.g. [34] Equations (1.129), (1.71)), i.e.:

$$\Omega_{gas} = \int_{E_0 < E < E_0 + \delta E} \frac{d^{3N}p \, d^{3N}q}{N! \, h^{3N}} \tag{5.5}$$


Now, we put the GPT-systems into the boxes. As the boxes can be distinguished by their position and momentum (i.e. the phase space coordinates), there are

$$\Omega_{GPT\text{-}configuration} = \frac{N!}{N_1! \, N_2! \cdots}$$

ways to do this. Thus with Equation (5.4):

$$\Omega_{total} = \int_{E_0 < E < E_0 + \delta E} \frac{d^{3N}p \, d^{3N}q}{h^{3N} \, N_1! \, N_2! \cdots} \tag{5.6}$$


The other point of view is to directly incorporate the idea that the inner GPT-states serve as classical labels, see e.g. [37] Equations (1.130), (1.71). Then some of the boxes are distinguishable and we directly find Equation (5.6).


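Next, we use Stirling's formula

$$\ln(n!) \approx n \ln(n) - n \tag{5.7}$$

together with $\sum_j N_j = N$ and $p_j = \frac{N_j}{N}$, the probability that a random system in the ensemble is found in the state $w_j$. With $S_{gas} := k_B \ln \Omega_{gas}$ we find:

$$S = k_B \ln \Omega_{total} = k_B \ln\left( \Omega_{gas} \cdot \frac{N!}{N_1! \, N_2! \cdots} \right) \tag{5.8}$$

$$= S_{gas} + k_B \ln(N!) - k_B \sum_j \ln(N_j!) \tag{5.9}$$

$$\approx S_{gas} + k_B N \ln N - k_B N - k_B \sum_j N_j \ln(N_j) + k_B \sum_j N_j \tag{5.10}$$

$$= S_{gas} - k_B \sum_j N_j \left( \ln(N_j) - \ln(N) \right) = S_{gas} - k_B N \sum_j \frac{N_j}{N} \ln\left( \frac{N_j}{N} \right) \tag{5.11}$$

$$= S_{gas} - N k_B \sum_j p_j \ln p_j \tag{5.12}$$

Thus again we find

$$S_{GPT} = -N k_B \sum_j p_j \ln p_j, \qquad s_{GPT} = -k_B \sum_j p_j \ln p_j \tag{5.13}$$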
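The quality of the Stirling step can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the occupation numbers are arbitrary, and entropies are given in units of $k_B$. It compares the exact logarithm of the multinomial coefficient $N!/(N_1! N_2! \cdots)$ with the estimate $-N \sum_j p_j \ln p_j$.

```python
import math


def log_multinomial(counts):
    """Exact ln( N! / (N_1! N_2! ...) ), computed with the log-gamma function."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)


def stirling_entropy(counts):
    """Stirling estimate -N * sum_j p_j ln(p_j) with p_j = N_j / N."""
    n = sum(counts)
    return -n * sum((c / n) * math.log(c / n) for c in counts if c > 0)


# Fixed proportions p = (0.5, 0.3, 0.2), growing particle number N.
for scale in (10, 1000, 100000):
    counts = [5 * scale, 3 * scale, 2 * scale]
    exact = log_multinomial(counts)
    approx = stirling_entropy(counts)
    print(sum(counts), exact / approx)
```

The ratio approaches 1 as $N$ grows, because the neglected corrections only grow logarithmically in $N$ while the entropy grows linearly.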


The assumption that the $w_j$ are perfectly distinguishable entered through the analogy with classical labels: If the $w_j$ were not perfectly distinguishable, we could not distinguish between all the boxes that contain different $w_j$ at the same time. In particular, it would not be possible for an experimenter to find out how many boxes are of type $w_1$, how many of type $w_2$, and so on.

For example, in a gbit all states are described by statistical mixtures of the corners. These corners are pairwise distinguishable, however they are not perfectly distinguishable as a whole. Let the corners be called $w_1, w_2, w_3, w_4$. If we know we get either $w_i$ or $w_j$, $i, j \in \{1, 2, 3, 4\}$ and $i \neq j$, we can always find out which one it is. But if we are only told it is one of the states $w_1$, $w_2$, $w_3$ or $w_4$, it is not possible to find out which one of these states we got. There is no combinatorial rule from statistical mechanics for counting such "semi-distinguishable" configurations, which involve pairwise distinguishable states that are not distinguishable as a whole. The important property of gbits is that there are states without classical decomposition.

So we can see in this example that, even for more general GPTs, the considerations from statistical physics and thermodynamics still agree.

For the rest of Chapter 5, $S$ will denote the entropy already divided by the number of systems, i.e. the entropy we called $s_{GPT}$ before.

5.8 Entropy in classical and quantum physics 

In classical physics, the continuous generalization

$$S(\rho) = -k_B \int d\Gamma \, \rho \ln \rho, \tag{5.14}$$

where $\int d\Gamma$ denotes the phase space integral, is called the Gibbs entropy and is the usual entropy in classical equilibrium statistical physics, see [36] Equation (10.6.5). Like in the classical case, the quantum expression $S(\rho) = -k_B \mathrm{Tr}(\rho \ln \rho)$ is also known as the Gibbs entropy and is used for equilibrium physics, see [36] Equation (10.6.1).

5.9 Consistency 

The equation which we derived for the entropy depends on the coefficients of the classical decomposition. To be exact, so far we have only derived the entropy of a particular realisation of a state $w$ (via a particular ensemble). In the worst case, the same state might have different classical decompositions with different entropies. This would mean that the "state" $w$ is not a complete description of the thermodynamic properties: "The state does not describe the state". Adapting a proof found by Howard Barnum and Markus Mueller [11] that the coefficients of a classical decomposition majorize the coefficients of all convex decompositions of the same size, we can show that all GPTs satisfying Postulates 1+2 give rise to consistent thermodynamic entropies.


Theorem 5.1. By Postulates 1 and 2 from [10], the entropy of a state $w$ is well-defined, i.e. it does not depend on the choice of classical decomposition.


Proof. According to Theorem 4.3 (or [11], Proposition 3 from [10]), there is an inner product $\langle \cdot, \cdot \rangle$ with respect to which the state space is self-dual.

Let $w = \sum_{j=1}^{n} p_j w_j = \sum_{k=1}^{n} q_k w'_k$ be a state with two classical decompositions, i.e. the $w_j$ are pure and perfectly distinguishable and the $p_j$ form a probability distribution, and analogously for the $w'_k$ and $q_k$. Wlog we assume that the $w_j$ and $w'_k$ are frames of maximal size (adding terms of the form $0 \cdot \ln(0) = 0$ does not change the entropy). We will now adapt the proof of Barnum and Mueller for our own purpose.

For perfectly distinguishable states $a, b$ we have $\langle a, b \rangle = 0$. For pure states $w'_i$ we have $\langle w'_i, w'_i \rangle = 1$. Thus

$$q_i = \langle w'_i, w \rangle = \sum_j p_j \langle w'_i, w_j \rangle \tag{5.15}$$

We define $r_{ij} := \langle w'_i, w_j \rangle$, probability vectors $q, p$ given by $\{q_i\}_i$, $\{p_i\}_i$ and a matrix $R = (r_{ij})_{ij}$ such that $q = R \cdot p$.

Now we show that $R$ is doubly-stochastic:

The state space is self-dual, and by Proposition 4.10 (or Proposition 6 from [10]) we have $\sum_j \langle w'_j, \cdot \rangle \le u_A$, $\sum_i \langle w_i, \cdot \rangle \le u_A$. Every maximal frame adds up to the order unit, i.e. we have $\sum_j \langle w'_j, \cdot \rangle = u_A$, $\sum_i \langle w_i, \cdot \rangle = u_A$. Thus

$$\sum_j r_{ij} = \sum_j \langle w'_i, w_j \rangle = u_A(w'_i) = 1 \tag{5.16}$$

and also $\sum_i r_{ij} = 1$. For all states $a, b$ we have $\langle a, b \rangle \ge 0$, i.e. $r_{ij} \ge 0$.
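The step from a doubly-stochastic matrix to majorization can be illustrated numerically. The following sketch is an assumption-laden toy example and not the GPT construction itself: the matrix is built from the squared overlaps of two real orthonormal qubit bases (a hypothetical stand-in for the $r_{ij}$), and the helper names are invented for illustration. It only demonstrates that $q = R \cdot p$ with doubly-stochastic $R$ is majorized by $p$.

```python
import math


def overlap_matrix(theta):
    """r_ij = |<e'_i|e_j>|^2 for two real qubit bases rotated by the angle theta."""
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return [[c2, s2], [s2, c2]]


def apply(matrix, vec):
    """Matrix-vector product q = R p."""
    return [sum(r * v for r, v in zip(row, vec)) for row in matrix]


def majorized_by(q, p, tol=1e-12):
    """True if q is majorized by p: the partial sums of the decreasingly
    sorted q never exceed those of p, and the total sums agree."""
    qs, ps = sorted(q, reverse=True), sorted(p, reverse=True)
    acc_q = acc_p = 0.0
    for a, b in zip(qs, ps):
        acc_q += a
        acc_p += b
        if acc_q > acc_p + tol:
            return False
    return abs(acc_q - acc_p) <= 1e-9


R = overlap_matrix(0.4)
p = [0.9, 0.1]
q = apply(R, p)

row_sums = [sum(row) for row in R]
col_sums = [sum(R[i][j] for i in range(2)) for j in range(2)]
print(row_sums, col_sums, q, majorized_by(q, p))
```

In this toy example only one direction of majorization holds, because $q$ is not a second classical decomposition of the same state. In the proof both directions hold, which forces $q$ to be a permutation of $p$.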







Thus $q \prec p$ by a theorem of Hardy, Littlewood and Polya (see e.g. Lemma I.2.B.2 of [32]) and in the same way one shows $q \succ p$. Following [32], eq. I.1.B.(2), this implies that $q$ is a permutation of $p$, and thus the entropies agree. □

6 Generalized thought experiment

So far, we have only considered the von Neumann argument for a decomposition into perfectly distinguishable pure states. The interpretation behind that is: In QT, the pure states are of the form $|\psi\rangle\langle\psi|$. This state can be realized by preparing a system in the state $|\psi\rangle$. The density operators are then used to describe ensembles of systems in such quantum states. We expect that the states of maximal knowledge, i.e. the pure states, should be the basic states one can prepare on a single system. But we can combine the two interpretations of states, i.e. missing knowledge of a single system versus ensemble of many fully known systems: In principle it makes sense that we do not know the exact pure state a system is prepared in. Then it makes sense to describe single systems with mixed states. Indeed, this is the interpretation normally applied when talking about GPTs. And we can still put these only partially known systems into boxes. Following that idea, we will now apply a generalized version of the thought experiment as found in [22] for the quantum case. Petz already suggested that the argument can be used beyond quantum theory, as long as orthogonality makes sense.

Figure 6.1: Visualization of the argument which relates the thermodynamic entropy of a state to the entropies of a decomposition into perfectly distinguishable mixed states.


So let us assume we have a convex combination $w = \sum_{j=1}^{n} \lambda_j w_j$ ($\sum_j \lambda_j = 1$ and $\lambda_j \in [0, 1]$) with $w_j$ perfectly distinguishable ($e_k(w_j) = \delta_{jk}$, $\sum_k e_k = u_A$), but not necessarily pure.

Analogously to the thought experiment by von Neumann, we consider the following situation: In a tank with volume $V$, $N$ boxes form an ideal gas. For each $j$, $N_j$ of these boxes contain the GPT-state $w_j$. Thus we obtain an ensemble $w = \sum_j \frac{N_j}{N} w_j$, $\lambda_j = \frac{N_j}{N}$, in a volume $V$.

Just like before we can construct a semi-permeable membrane. However, we need to assume that the membrane can be constructed such that it does not disturb the $w_j$. Like in von Neumann's thought experiment, we use that membrane to separate the constituents: every $w_j$-gas with its $N_j$ particles now is in a separate tank with volume $V$. We compress each tank to the volume $\lambda_j V$. For this we need to perform the work

$$\delta W = -N k_B T \sum_j \lambda_j \ln(\lambda_j). \tag{6.1}$$

As $E \propto T = \text{const.}$, the bath gets

$$\delta Q = \delta W = -N k_B T \sum_j \lambda_j \ln(\lambda_j), \tag{6.2}$$

and the gases lose $\delta Q = \delta W$. This means the gases have "obtained"

$$\Delta S = N k_B \sum_j \lambda_j \ln(\lambda_j) \le 0. \tag{6.3}$$

All the tanks now have the same density $n_j = \frac{N_j}{\lambda_j V} = \frac{N}{V}$, which coincides with the density at the beginning.
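The work in Equation (6.1) is just the sum of the isothermal ideal-gas compression works of the individual tanks. The sketch below makes this explicit; the function names and the default constant are hypothetical choices for illustration, and the ideal gas law $pV = N_j k_B T$ is the only physical input.

```python
import math


def compression_work(n_total, lambdas, temperature, k_b=1.380649e-23):
    """Work performed on the gases when tank j, holding N_j = lambda_j * N
    particles, is compressed isothermally from volume V to lambda_j * V."""
    work = 0.0
    for lam in lambdas:
        if lam <= 0.0:
            continue  # an empty tank contributes nothing
        n_j = lam * n_total
        # Ideal gas: W_j = N_j k_B T ln(V_initial / V_final) = -N_j k_B T ln(lambda_j)
        work += n_j * k_b * temperature * math.log(1.0 / lam)
    return work


def entropy_change(n_total, lambdas, temperature, k_b=1.380649e-23):
    """Entropy change of the gases: they lose the heat dQ = dW at temperature T."""
    return -compression_work(n_total, lambdas, temperature, k_b) / temperature


# Units with k_B = 1, N = 1, T = 1 and two equally weighted components.
w = compression_work(1.0, [0.5, 0.5], 1.0, k_b=1.0)
ds = entropy_change(1.0, [0.5, 0.5], 1.0, k_b=1.0)
print(w, ds)
```

The result reproduces $\delta W = -N k_B T \sum_j \lambda_j \ln(\lambda_j)$ and $\Delta S = N k_B \sum_j \lambda_j \ln(\lambda_j)$ for the chosen weights.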

Now we go back to the very beginning: One single tank of volume $V$ with a $w$-gas of $N$ particles. We insert walls such that we end up with $n$ tanks of volumes $\lambda_j V$. This step is reversible. We assume that the entropy is extensive and additive, i.e. we have $\sum_{j=1}^{n} S_j(w) = S(w)$, where $S_j(w)$ is the entropy of the $w$-gas in the $j$-th tank. In the situation from the paragraph before, we have $S(w) + \Delta S = \sum_{j=1}^{n} S_j(w_j)$ (reversible!). Thus:

$$S(w) = \sum_{j=1}^{n} S_j(w) = \sum_{j=1}^{n} S_j(w_j) - N k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j) \tag{6.4}$$

or:

$$S(w) - \sum_{j=1}^{n} S_j(w_j) = \sum_{j=1}^{n} \left( S_j(w) - S_j(w_j) \right) = -N k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j) \tag{6.5}$$

As the $j$-th tank in both situations has the same macroscopic conditions (volume, particle number, temperature), the only difference in entropy of the $j$-th tank can be caused by the internal GPT entropy. Thus we find:

$$\sum_{j=1}^{n} \left( S_{GPT,j}(w) - S_{GPT,j}(w_j) \right) = -N k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j)$$

We also assume that $S_{GPT}$ is extensive with $\sum_{j=1}^{n} S_{GPT,j}(w) = S_{GPT}(w)$ and $S_{GPT,j}(w_j) = \frac{N_j}{N} S_{GPT}(w_j)$, i.e. that a homogeneity relation holds. Here, $S_{GPT}(w')$ refers to a tank of volume $V$ filled with $N$ boxes that form a $w'$-ensemble.$^{2}$ This makes sense if we assume that the GPT-entropy is additive:

Figure 6.2: This figure visualizes the "wall removing/putting back in"-argument. A tank of volume $V = V_1 + V_2 + V_3$ filled with a $w'$-gas is split by putting in walls into tanks with $N_j = \lambda_j N$ particles and volumes $V_j = \lambda_j V$ ($j = 1, 2, 3$); removing the walls restores the original tank.

Tanks with volume $V_j := \lambda_j V$, $\sum_j \lambda_j = 1$, filled with a $w'$-gas at density $\frac{N}{V}$ can be merged by removing the walls. As all tanks have the same density, this step is reversible by putting the walls back in. Thus it makes sense that

$$\sum_{j=1}^{n} S_{GPT}(w', \lambda_j N, \lambda_j V) = S_{GPT}(w', N, V). \tag{6.6}$$


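In particular, in the cases where all $\lambda_j$ are equal ($\lambda_j = \frac{1}{n}$) we find

$$n \, S_{GPT}\left(w', \frac{N}{n}, \frac{V}{n}\right) = S_{GPT}(w', N, V) \quad \Rightarrow \quad S_{GPT}\left(w', \frac{N}{n}, \frac{V}{n}\right) = \frac{1}{n} S_{GPT}(w', N, V).$$

As by the same wall removing/putting back in-argument

$$m \, S_{GPT}(w', N, V) = S_{GPT}(w', mN, mV),$$

$m \in \mathbb{N}$, we find in total

$$S_{GPT}\left(w', \frac{m}{n} N, \frac{m}{n} V\right) = \frac{m}{n} S_{GPT}(w', N, V), \tag{6.7}$$

thus$^{3}$ by continuity

$$S_{GPT}(w', pN, pV) = p \cdot S_{GPT}(w', N, V). \tag{6.8}$$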

$^{2}$ This is clear if we consider the equation $S_{GPT}(w') = -N k_B \sum_j q_j \ln(q_j) \propto N$, where the $q_j$ are the coefficients of a classical decomposition. However, one should be aware that here we are checking for self-consistency of the entropy. The self-consistency might fail, even if the entropy itself is well-defined as a function. Furthermore, we later wish to apply this generalized von Neumann thought experiment to systems that do not always have classical decompositions, like the gbit.

$^{3}$ If you feel uncomfortable about using these extensivity/additivity arguments for the GPT entropy, you can instead use them on the total entropy in $\sum_{j=1}^{n} S_j(w) - \sum_{j=1}^{n} S_j(w_j) = -N k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j)$, obtaining $\sum_{j=1}^{n} \lambda_j S(w) - \sum_{j=1}^{n} \lambda_j S(w_j) = -N k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j)$. Then, we can still identify $S(w) - S(w_j) = S_{GPT}(w) - S_{GPT}(w_j)$.






















































































Therefore we find $\sum_{j=1}^{n} S_{GPT,j}(w) = S_{GPT}(w)$ and $\sum_{j=1}^{n} S_{GPT,j}(w_j) = \sum_{j=1}^{n} \lambda_j S_{GPT}(w_j)$, which leads to our end result:

$$S_{GPT}(w) = \sum_{j=1}^{n} \lambda_j \cdot S_{GPT}(w_j) - N k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j) \tag{6.9}$$

$$s_{GPT}(w) = \sum_{j=1}^{n} \lambda_j \cdot s_{GPT}(w_j) - k_B \sum_{j=1}^{n} \lambda_j \ln(\lambda_j) \tag{6.10}$$

where again $s := \frac{S}{N}$. There is no $N$-dependence any more. This result coincides with that of Petz [22].
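Equation (6.10) is a simple recursion: the entropy of a mixture is the weighted average of the component entropies plus the mixing term. A minimal sketch follows, with hypothetical function names and entropies in units of $k_B$ unless another value of `k_b` is passed.

```python
import math


def shannon(probs):
    """-sum_j p_j ln(p_j) with the convention 0 * ln(0) = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mixture_entropy(lambdas, component_entropies, k_b=1.0):
    """Equation (6.10): s(w) = sum_j lambda_j s(w_j) - k_B sum_j lambda_j ln(lambda_j).

    lambdas: weights of the perfectly distinguishable components w_j.
    component_entropies: per-particle entropies s(w_j), already including k_B.
    """
    if len(lambdas) != len(component_entropies):
        raise ValueError("need one entropy per component")
    if abs(sum(lambdas) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    average = sum(l * s for l, s in zip(lambdas, component_entropies))
    return average + k_b * shannon(lambdas)


# Pure components (entropy 0): only the mixing term survives.
s_pure_parts = mixture_entropy([0.25, 0.75], [0.0, 0.0])
# Mixed components: both contributions add up.
s_mixed_parts = mixture_entropy([0.5, 0.5], [math.log(2), 0.0])
print(s_pure_parts, s_mixed_parts)
```

For pure components the formula reduces to the entropy of Chapter 5, which is the consistency one expects between the two versions of the thought experiment.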

It is important that we have not applied any strong symmetry argument. In particular, we have NOT assumed that the $w_j$ can be reversibly transformed into each other.

So far, we have only expressed the entropy of one state by the entropies of other states. To get a final numerical value for the entropy, we need to reduce the entropy of a state to states whose entropies are known (or chosen, by convention). For GPTs satisfying the classical decomposition postulate, one can reduce the entropy to those of pure states. By strong symmetry one argues that all the pure states have the same entropy, which we set to 0. If we had chosen another value, the entropy would not necessarily be extensive, unless we also chose this additional summand to scale with $N$.


6.1 Classical mixtures and their entropy 

In classical physics one finds for classical labels $j$ (see e.g. [35] Equations (3.57), (3.56), (3.55), (3.54)):

$$S(T, V, N, w) = \sum_{j=1}^{n} S(T, V_j, N_j, w_j) - k_B \sum_{j=1}^{n} N_j \ln\left( \frac{N_j}{N} \right) \tag{6.11}$$

If we use $\lambda_j = \frac{N_j}{N} = \frac{V_j}{V}$ and that the entropy is homogeneous, i.e. $S(T, V_j, N_j, w_j) = \lambda_j S(T, V, N, w_j)$ by Equation (3.39) from [35], then we find:

$$S(T, V, N, w) = \sum_{j=1}^{n} \lambda_j S(T, V, N, w_j) - k_B N \sum_{j=1}^{n} \lambda_j \ln(\lambda_j) \tag{6.12}$$

This result coincides with the result from the generalized von Neumann argument. The last term is the mixture entropy. It also shows that we really can just add the mixing/internal entropy to the external entropy.

For the rest of Chapter 6, $S$ will denote the entropy already divided by the number of systems, i.e. the entropy we called $s_{GPT}$ before. Furthermore, we omit $k_B$.



6.2 Quantum mixtures and their entropy

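Let $\rho = \sum_{j=1}^{n'} p_j \rho_j$ be a convex combination with $\mathrm{Tr}(\rho_j \rho_k) \propto \delta_{jk}$. We want to prove

$$S(\rho) = \sum_{j=1}^{n'} p_j \cdot S(\rho_j) - \sum_{j=1}^{n'} p_j \ln(p_j) \tag{6.13}$$

for the von Neumann entropy ($k_B = 1$, $N$ already divided out).

Let $|1_j\rangle, \ldots, |n_j\rangle$ be an eigenbasis of $\rho_j$ with eigenvalues $\lambda_a^{(j)}$. Let $|1_j\rangle, \ldots, |f(j)_j\rangle$, where $f(j) \in \{1, \ldots, n\}$, be the eigenvectors with $\lambda_a^{(j)} \neq 0$. Thus we find ($j \neq k$):

$$0 = \mathrm{Tr}(\rho_j \rho_k) = \sum_{a=1}^{n} \langle a_j | \rho_j \rho_k | a_j \rangle = \sum_{a=1}^{f(j)} \lambda_a^{(j)} \langle a_j | \rho_k | a_j \rangle$$

Thus $\langle a_j | \rho_k | a_j \rangle = 0$ for $a \le f(j)$ because $\rho_k$ is positive semi-definite. As $1 = \mathrm{Tr}(\rho_k)$ we have $1 = \sum_{a=f(j)+1}^{n} \langle a_j | \rho_k | a_j \rangle$. For $\rho_k = \sum_b \lambda_b^{(k)} |b_k\rangle\langle b_k|$ we thus find:

$$1 = \sum_{a=f(j)+1}^{n} \sum_{b=1}^{n} \lambda_b^{(k)} \cdot |\langle a_j | b_k \rangle|^2 = \sum_{b=1}^{n} \lambda_b^{(k)} \sum_{a=f(j)+1}^{n} |\langle a_j | b_k \rangle|^2 = \sum_{b=1}^{f(k)} \lambda_b^{(k)} \sum_{a=f(j)+1}^{n} |\langle a_j | b_k \rangle|^2$$

Therefore $\sum_{a=f(j)+1}^{n} |\langle a_j | b_k \rangle|^2 = 1$ for all $b$ with $\lambda_b^{(k)} \neq 0$, which means those $|b_k\rangle$ that actually appear in the eigendecomposition of $\rho_k$. We used $\sum_{a=1}^{n} |\langle a_j | b_k \rangle|^2 = 1$ and $\sum_b \lambda_b^{(k)} = 1$. Thus $\sum_{a=f(j)+1}^{n} |\langle a_j | b_k \rangle|^2 \le 1$, but we need equality because $\sum_b \lambda_b^{(k)} q_b < 1$ for $q_b \in [0, 1]$ if one $q_b < 1$ belongs to a non-vanishing $\lambda_b^{(k)}$.

We find $|\langle a_j | b_k \rangle|^2 = 0$ for all $a \in \{1, \ldots, f(j)\}$, $b \in \{1, \ldots, f(k)\}$. In nicer words: The eigenvectors of $\rho_j$ and $\rho_k$ which belong to non-vanishing eigenvalues are orthogonal to each other. In particular, the eigenvectors of $\rho_j$ and $\rho_k$ that actually appear in the decomposition are orthogonal. We have shown that the $\rho_k$ have support on orthogonal subspaces. By Theorem 11.8(4) from [53] we finally find Equation (6.13).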
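Equation (6.13) can be checked at the level of eigenvalues. The sketch below relies on one assumption that follows from the orthogonal supports shown above: $\rho = \sum_j p_j \rho_j$ is then block diagonal, so its non-zero eigenvalues are the products $p_j \lambda_a^{(j)}$. The function names and the numerical spectra are hypothetical.

```python
import math


def von_neumann(eigenvalues):
    """S = -sum_a x_a ln(x_a) for a list of eigenvalues (k_B = 1)."""
    return -sum(x * math.log(x) for x in eigenvalues if x > 0.0)


def spectrum_of_mixture(weights, spectra):
    """Non-zero eigenvalues of rho = sum_j p_j rho_j, assuming the rho_j
    have mutually orthogonal supports (block-diagonal structure)."""
    return [p * lam for p, spec in zip(weights, spectra) for lam in spec]


weights = [0.3, 0.7]                 # p_j
spectra = [[0.5, 0.5], [0.9, 0.1]]   # eigenvalues of rho_1 and rho_2

lhs = von_neumann(spectrum_of_mixture(weights, spectra))
rhs = sum(p * von_neumann(s) for p, s in zip(weights, spectra)) + von_neumann(weights)
print(lhs, rhs)
```

Both sides agree up to rounding, which is the content of Equation (6.13) for this example.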


6.3 Maximal consistency for GPTs 

Theorem 6.1. Consider a GPT which satisfies Postulate 1 (Classical Decomposition) and Postulate 2 (Strong Symmetry) from [10].

Then the thermodynamic entropy is fully von Neumann argument-compatible, i.e. it is fully compatible with decompositions into perfectly distinguishable (mixed) states and their semipermeable membranes:

Let $w = \sum_j p_j w_j$ be a convex combination of perfectly distinguishable states (they are allowed to be mixed). Then:

$$S(w) = \sum_j p_j S(w_j) - \sum_j p_j \ln p_j \tag{6.14}$$


Proof. By Postulate 1, the $w_j$ have classical decompositions $w_j = \sum_k q_k^{(j)} w_k^{(j)}$. Without loss of generality, we assume $q_k^{(j)} > 0$. By Proposition 3.6, leaving out effects from the measurement causes no problems. There are effects with $e_j(w_a) = \delta_{ja}$. By Lemma 3.18, $e_j^{-1}(1)$ is a face of $\Omega_A$. By our proof of Lemma 3.18 (or by Proposition 2.7 of the corresponding reference) we find $w_k^{(j)} \in e_j^{-1}(1) \; \forall k$, i.e. $e_j(w_k^{(j)}) = 1$. Furthermore, as $e_a(w_j) = 0 \; \forall a \neq j$ and $e_a^{-1}(0) \cap \Omega_A$ is a face too, also $e_a(w_k^{(j)}) = 0$. Thus in total, $e_a(w_k^{(j)}) = \delta_{ja}$.

So far, by perfect distinguishability within each classical decomposition, we have $\langle w_k^{(j)}, w_a^{(j)} \rangle = \delta_{ka}$ (equality holds because pure states are normalized to 1 by $\langle \cdot, \cdot \rangle$). As $e_a(w_b^{(j)}) = \delta_{aj}$, for $j \neq k$ also $w_a^{(j)}$ and $w_b^{(k)}$ are perfectly distinguishable by $e_j$ and $e_k$. Thus we also have $\langle w_a^{(j)}, w_b^{(k)} \rangle = 0 \; \forall j \neq k$. Thus in total $\langle w_a^{(j)}, w_b^{(k)} \rangle = \delta_{ab} \delta_{jk}$ (equality because pure).

Therefore all the $w_k^{(j)}$ form a frame and are perfectly distinguishable.

Thus $w = \sum_{j,k} p_j q_k^{(j)} w_k^{(j)}$, with $\sum_{j,k} p_j q_k^{(j)} = \sum_j p_j = 1$ and $p_j q_k^{(j)} \in [0, 1]$, is a classical decomposition. In particular, by our definition of the thermodynamic entropy:

$$S(w) = -\sum_{j,k} p_j q_k^{(j)} \ln\left( p_j q_k^{(j)} \right) = -\sum_{j,k} p_j q_k^{(j)} \ln\left( q_k^{(j)} \right) - \sum_{j,k} p_j q_k^{(j)} \ln(p_j) = \sum_j p_j \cdot S(w_j) - \sum_j p_j \ln(p_j) \tag{6.15}$$

□


6.4 Gbits do not satisfy maximal consistency 


The gbit does not allow a classical decomposition in general. But every state can be decomposed into perfectly distinguishable mixed states found in opposing edges of the gbit. As we have already seen in Chapter 3, the corresponding membrane can be constructed such that it does not perturb the states being distinguished by it: the transformations corresponding to effects distinguishing opposing edges (and only those can be distinguished) can be chosen such that the faces collapse to an arbitrary state in that edge. And those states have a classical decomposition. So the idea is as follows, see Figure 6.3: By using the equation of the von Neumann argument for decompositions into mixed states, we reduce the entropy of an arbitrary state to the entropy of states in the boundary.

For a convex combination $w = \sum_j p_j w'_j$ with $w'_j$ perfectly distinguishable but not necessarily pure, we have

$$S(w) = \sum_{j=1}^{n} p_j \cdot S(w'_j) - \sum_{j=1}^{n} p_j \ln(p_j)$$

We choose the $w'_j$ such that they are found in opposing edges of the gbit. Then we use the same equation to reduce the entropies $S(w'_j)$ to the entropies of the pure states, i.e. the corners. As a corner can be reversibly transformed into any other corner by a rotation, we assume that all corners have the same entropy (0 by convention). This makes sense, as no corner is special.


The basic question is: Is this a self-consistent way to define the thermodynamic 
entropy for a gbit? It is not, as we will show now. 

Figure 6.3: Example of how to decompose an arbitrary non-pure state into perfectly distinguishable (mixed) states. If more than one such state is necessary, they can always be chosen in opposing edges of the gbit.

Clockwise starting in the upper-left corner, we denote the corners by $w_1, w_2, w_3, w_4$. We consider the "maximally mixed state"

$$w = \frac{1}{2} \left( \frac{1}{2} w_1 + \frac{1}{2} w_2 \right) + \frac{1}{2} \left( \frac{1}{2} w_3 + \frac{1}{2} w_4 \right) \tag{6.16}$$

$$= \frac{1}{4} w_1 + \frac{1}{4} w_2 + \frac{1}{4} w_3 + \frac{1}{4} w_4 = \frac{1}{2} w_1 + \frac{1}{2} w_3 = \frac{1}{2} w_2 + \frac{1}{2} w_4 \tag{6.17}$$

found in the center of the square. There are many ways in which this state can be decomposed into states found in the boundary, and it also does have classical decompositions. For arbitrary $a \in [0, 1]$ we define $v_a := a \cdot w_1 + (1 - a) \cdot w_2$. Then, we also define $v'_a := a \cdot w_3 + (1 - a) \cdot w_4$. Thus we find:

$$\frac{1}{2} v_a + \frac{1}{2} v'_a = a \left( \frac{1}{2} w_1 + \frac{1}{2} w_3 \right) + (1 - a) \left( \frac{1}{2} w_2 + \frac{1}{2} w_4 \right) = a w + (1 - a) w = w$$

Figure 6.4: This figure visualizes how we choose $v_a$ and $v'_a$ as decompositions of the "maximally mixed state".

$v_a$ and $v'_a$ are perfectly distinguishable, because they are found in opposing edges of the square. So our entropy should be:

$$S(w) = \frac{1}{2} \cdot S(v_a) + \frac{1}{2} \cdot S(v'_a) - k_B \cdot 2 \cdot \left( \frac{1}{2} \ln\left( \frac{1}{2} \right) \right) = \frac{1}{2} \cdot \left( S(v_a) + S(v'_a) \right) + k_B \ln 2$$

While the equation of the entropy did not require that $v_a$ and $v'_a$ can be reversibly transformed into each other, here it is possible by a 180 degree rotation: $T(w_1) = w_3$, $T(w_2) = w_4$, $T(w_3) = w_1$, $T(w_4) = w_2$, i.e. $T(v_a) = v'_a$, $T(v'_a) = v_a$. This has the consequence that both states have the same entropy, which we will now determine:

$$S(v_a) = a \cdot S(w_1) + (1 - a) \cdot S(w_2) - k_B a \ln(a) - k_B (1 - a) \ln(1 - a)$$

$$= -k_B a \ln(a) - k_B (1 - a) \ln(1 - a) \tag{6.18}$$

We used our convention $S(w_j) = 0$ and that neighbouring corners can be perfectly distinguished, because there are two opposing edges which contain these corners. In the same way: $S(v'_a) = -k_B a \ln(a) - k_B (1 - a) \ln(1 - a)$. So in total, we find for the entropy of the center:

$$S(w) = -k_B a \ln(a) - k_B (1 - a) \ln(1 - a) + k_B \ln 2 \tag{6.19}$$
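Equation (6.19) is easy to evaluate for different values of $a$. The following sketch uses hypothetical function names and units with $k_B = 1$ by default; it only tabulates the value that the generalized von Neumann argument would assign to the center of the gbit for a given decomposition parameter $a$.

```python
import math


def binary_entropy(a):
    """-a ln(a) - (1 - a) ln(1 - a) in nats, with the convention 0 * ln(0) = 0."""
    return -sum(x * math.log(x) for x in (a, 1.0 - a) if x > 0.0)


def center_entropy(a, k_b=1.0):
    """Equation (6.19): entropy assigned to the gbit center when it is
    decomposed into v_a and v'_a."""
    return k_b * (binary_entropy(a) + math.log(2.0))


for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(a, center_entropy(a))
```

The same state receives different values depending on $a$, so the assignment cannot define a function of the state alone.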

This result is not self-consistent, as it still depends on $a$, i.e. on the chosen decomposition. This shows that it is not possible to define an entropy on the states of the gbit which is compatible with von Neumann's thermodynamic argument. This entropy varies from $k_B \ln 2$ for $a = 0, 1$ to $2 k_B \ln 2$ for $a = \frac{1}{2}$.

7 Projective measurements for GPTs

So far we have considered membranes as in von Neumann's original argument for the derivation of the von Neumann entropy. While there are measurements which can perfectly distinguish the states considered in the argument, it is not clear what happens to these states. Now we want to make the measurement more concrete, by modelling it as an operation. Like in quantum theory, there exist projective measurements that have the desired properties. At first we will consider projectors onto "minimal" faces, each of them generated by only one pure state. These projectors can be used to perfectly distinguish the states of (maximal) frames. Afterwards we will consider more general projective measurements as needed to distinguish perfectly distinguishable states which in general are not pure.


7.1 Projective measurements for frames 


Here, we consider projective measurements which perfectly distinguish the elements of (maximal) frames. In quantum theory, these measurements correspond to the measurement of non-degenerate observables, as we will explain in Chapter 8. Remember from Section 4 that every pure state generates a face while every face is generated by a frame of one or more pure states. For every face $F$ there is a positive, symmetric projector $P_F$ onto the linear span of the face. $u_F = u_A \circ P_F$ is an effect and is called the projective order unit of the face. If $v_1, \ldots, v_k$ is a frame that generates $F$, then $u_F = \langle \sum_{j=1}^{k} v_j, \cdot \rangle$ or $u_F = \sum_{j=1}^{k} v_j$ using self-duality.

Lemma 7.1. Suppose that strong symmetry and classical decomposability are satisfied. The face $F := \mathbb{R}_{\ge 0} \cdot \{w\}$ for $w$ pure has $u_F = w$ because $F$ is generated by $w$. Furthermore, for a frame $w_1, \ldots, w_n$ and $j \neq k$, we find $u_A \circ P_j(w_k) = u_j(w_k) = \langle w_j, w_k \rangle = 0$, where $u_j$ and $P_j$ are the projective unit and the orthogonal projector of the face $\mathbb{R}_{\ge 0} \cdot \{w_j\}$. As $P_j$ is positive and $0 \in A_+$ is the only state normalised to 0, we find $P_j(w_k) = 0$.


Theorem 7.2. Assume strong symmetry and classical decomposability. Let $w_1, \ldots, w_n$ be a maximal frame.$^{4}$ Let $u_j$ be the projective unit and $P_j$ the orthogonal projector corresponding to the face $F_j := \mathbb{R}_{\ge 0} \cdot \{w_j\}$. Then $\{P_1, \ldots, P_n\}$ form a valid operation. They induce a measurement which perfectly distinguishes the elements of the frame: $u_j(w_k) = u_A \circ P_j(w_k) = \delta_{jk}$. Furthermore, $P_j(w_k) = \delta_{jk} w_k$. Thus the measurement does not disturb the frame.


Proof. In Lemma 7.1 we have seen $P_j(w_k) = \delta_{jk} w_k$ for $j \neq k$. As projectors are surjective onto the linear spans of the corresponding faces (as implied by the word "onto"), there is a vector $w \in A$ with $w_j = P_j w$. As $P_j$ is a projector and thus $P_j P_j = P_j$, we find $P_j w_j = P_j P_j w = P_j w = w_j$, i.e. $P_j(w_k) = \delta_{jk} w_k$ for $j = k$.

As $0 \le u_j = u_A \circ P_j \le u_A$, the projectors induce valid effects and in particular are normalization-non-increasing. Furthermore, we already know that the projectors are positive. Thus the projectors are valid transformations. As maximal frames add up to the order unit, we find $\sum_j u_A \circ P_j = \sum_j u_j = \sum_j w_j = u_A$, i.e. a full measurement is obtained. So in total we have a full operation. It perfectly distinguishes the frame, because $u_j(w_k) = u_A \circ P_j(w_k) = \delta_{jk}$. □

$^{4}$ Remember that every frame can be extended to a maximal frame.
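The quantum special case gives a concrete picture of Theorem 7.2. The sketch below rests on assumptions that are not part of the GPT proof: states are density matrices, the frame is the computational basis $|k\rangle\langle k|$, the projector acts as $\rho \mapsto P_j \rho P_j$, and the order unit $u_A$ is the trace. All helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(m):
    """Quantum analogue of the order unit u_A."""
    return sum(m[i][i] for i in range(len(m)))


def basis_projector(j, dim):
    """|j><j| in the computational basis; it is both a pure state and a projector."""
    return [[1.0 if (r == j and c == j) else 0.0 for c in range(dim)] for r in range(dim)]


def project(p, rho):
    """Quantum analogue of P_j(w): rho -> P rho P."""
    return matmul(matmul(p, rho), p)


dim = 3
frame = [basis_projector(k, dim) for k in range(dim)]
zero = [[0.0] * dim for _ in range(dim)]

# P_j(w_k) = delta_jk * w_k: frame elements are either untouched or annihilated.
undisturbed = all(
    project(frame[j], frame[k]) == (frame[k] if j == k else zero)
    for j in range(dim)
    for k in range(dim)
)

# For an arbitrary state the outcome probabilities u_A(P_j(rho)) sum to 1.
rho = [[0.5, 0.2, 0.0], [0.2, 0.3, 0.1], [0.0, 0.1, 0.2]]
probabilities = [trace(project(frame[j], rho)) for j in range(dim)]
print(undisturbed, probabilities, sum(probabilities))
```

States outside the frame, such as `rho` above, are generally disturbed by the measurement; only the frame elements themselves are guaranteed to survive unchanged.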

Corollary 7.3. The operation constructed above satisfies all properties needed for 
the membrane in the von Neumann thought experiment. 

Comment. While we have shown that the operation is mathematically well-defined, it is not clear whether the projectors actually are physically allowed. The postulates from [10] do not consider non-reversible transformations and in particular do not assume that the projectors which model the M-slit experiments in Postulate 3 are actually physically allowed. The von Neumann thought experiment gives a good reason to assume that projectors are physically allowed.

One might motivate this assumption similarly to Pfister's "one simple postulate" [13]: As each of the frame elements will have a clear, predetermined outcome, it should be possible to do that measurement without disturbing them. Another motivation is that a frame generates a classical subspace (see motivation for Postulate 1). If we only consider the $w_j$ and the $u_j$, then we only work in that classical subspace, i.e. we have a classical behaviour. In classical physics, measurements can in principle be done without disturbing the system. Furthermore, as M-slit experiments are built from a slit-plane followed by a detector-plane, it is natural to assume that the system survives the slit-plane.


7.2 General projective measurements 

While the measurements from the previous section fit the original von Neumann argument, the generalized version found in Petz [22] needs similar statements for more general projective measurements. This means that we now want to find measurements that perfectly distinguish some perfectly distinguishable states, which might be mixed, but without disturbing these states.


Lemma 7.4. Assume strong symmetry and classical decomposability. Let $w_1, \ldots, w_n \in \Omega_A$ be perfectly distinguishable, but not necessarily pure. Let $F_j \subset \Omega_A$ be the minimal faces that contain $w_j$. Then $F_j \subset e_j^{-1}(1)$, $F_j \subset e_k^{-1}(0)$ for $j \neq k$, where the $e_j$ are effects that perfectly distinguish the $w_k$, i.e. $e_j(w_k) = \delta_{jk}$. Furthermore, the faces $F_j$ are orthogonal to each other. The same is true for the corresponding faces of $A_+$.


Proof. By the definition of perfectly distinguishable, there exist effects with $e_j(w_k) = \delta_{jk}$. For $j \neq k$, $e_k^{-1}(0)$ and $e_j^{-1}(1)$ are faces (see Lemma 3.18) which contain $w_j$. As $F_j$ is the minimal face which contains $w_j$, we find $F_j \subset e_j^{-1}(1)$, $F_j \subset e_k^{-1}(0)$ for $j \neq k$.

As the faces $F_j$ are contained in $e_j^{-1}(1)$, $e_k^{-1}(0)$ for $j \neq k$, these effects also perfectly distinguish the faces $F_j$, i.e. they are orthogonal by Theorem 4.3.

Thus also $\langle \mathbb{R}_{\ge 0} \cdot F_j, \mathbb{R}_{\ge 0} \cdot F_k \rangle = \{0\}$ for $j \neq k$. □

Lemma 7.5. Suppose strong symmetry and classical decomposability are satisfied. Let $w_1, \ldots, w_n \in \Omega_A$ be perfectly distinguishable, but not necessarily pure. Let $F_j \subset \Omega_A$ be the minimal faces that contain $w_j$. Let $P_j$ be the symmetric projection onto the linear span of $\mathbb{R}_{\ge 0} \cdot F_j$. Then $P_j w_k = \delta_{jk} w_k$.

Proof. We apply the same tricks as before: As $w_j \in \mathrm{im}(P_j)$, $\exists w \in A : P_j w = w_j$. Thus $P_j w_j = P_j P_j w = P_j w = w_j$.

Let $w_1^{(j)}, \ldots, w_{f(j)}^{(j)}$ be frames that generate $\mathbb{R}_{\ge 0} \cdot F_j$ and $u_j$ the corresponding projective unit. Then $u_j = \sum_{k=1}^{f(j)} w_k^{(j)}$. In particular $u_A \circ P_j = u_j = \sum_{k=1}^{f(j)} w_k^{(j)}$. By Lemma 7.4 the faces are orthogonal and thus $u_A \circ P_j(w_k) = 0$ for $j \neq k$, i.e. $P_j(w_k) = 0$ for $j \neq k$ by positivity. □


Theorem 7.6. Assume strong symmetry and classical decomposability. Let $w_1, \ldots, w_n \in \Omega_A$ be perfectly distinguishable, but not necessarily pure or maximal. Then there exists a projective measurement that perfectly distinguishes the $w_k$ in the following sense:

$\exists P_1, \ldots, P_{n+f(n+1)}$ orthogonal positive projectors that form an operation with $\sum_{k=1}^{n+f(n+1)} u_A \circ P_k = u_A$ and $u_A \circ P_j(w_k) = \delta_{jk}$ for $j \in \{1, \ldots, n + f(n+1)\}$ and $k \in \{1, \ldots, n\}$.


Proof. We consider the minimal faces $F_j \subset \Omega_A$ that contain $w_j$.

Let $w_1^{(j)}, \ldots, w_{f(j)}^{(j)}$ be frames that generate $\mathbb{R}_{\ge 0} \cdot F_j$ and $u_j$ the corresponding projective unit, i.e. $u_j = \sum_{k=1}^{f(j)} w_k^{(j)}$. As the $w_k^{(j)}$ are pairwise orthogonal pure states, by Proposition 4.10 (or Proposition 6 from [10]) they are a frame, which can be extended to a maximal frame. We will call the new frame elements $w_1^{(n+1)}, \ldots, w_{f(n+1)}^{(n+1)}$. Also consider the positive orthogonal projectors $P_{n+k}$ onto the faces $\mathbb{R}_{\ge 0} \cdot w_k^{(n+1)}$.

Then $0 \le u_j = u_A \circ P_j \le u_A$ for $j = 1, 2, \ldots, n + f(n+1)$. As the projectors are also positive, they are valid transformations.

Furthermore:

$$\sum_a u_A \circ P_a = \sum_{j=1}^{n} u_A \circ P_j + \sum_{k=1}^{f(n+1)} u_A \circ P_{n+k} = \sum_{j=1}^{n} \sum_{k=1}^{f(j)} w_k^{(j)} + \sum_{k=1}^{f(n+1)} w_k^{(n+1)} = u_A$$

because maximal frames add up to the order unit. Thus we obtain a full projective operation.

For $P_1, \ldots, P_n$ we already know $P_j(w_k) = \delta_{jk} w_k$. As $u_A \circ P_{n+k} w_j = \langle w_k^{(n+1)}, w_j \rangle = 0$ (alternatively, use that $u_A \circ P_j(w_j) = 1$ implies $u_A \circ P_{n+k}(w_j) = 0$ by having a normalized operation), we find that only the projector $P_j$ has a non-zero probability to be performed on $w_j$, and it does not disturb $w_j$. □

Comment. One can drop the last $f(n+1)$ effects if one allows the measurement to be normalized to less than $u_A$. Then one still finds $u_A \circ P_j(w_k) = \delta_{jk}$.

The same considerations as in the previous section also apply here, i.e. it is not clear whether the projections are physically allowed; this has to be assumed as a well-motivated postulate. Furthermore, the projective measurement created in this section is perfectly suited for the generalized version of von Neumann's thought experiment found in [22].

8 Second law

The most famous property of the entropy is that it fulfils the second law of thermodynamics. At first we will discuss whether the second law holds in time evolution. Afterwards we will explain why even in quantum theory some quantum operations decrease entropy. This implies that an increase in entropy can only be shown for some subsets of the set of all quantum operations. We will therefore analyse projective measurements, as they are a central part of quantum theory. At last, we show that the entropy also never decreases during mixing processes.

From now on, $S$ will always denote the entropy divided by the number of systems, i.e. the entropy we called $s_{GPT}$ before. Furthermore we omit $k_B$. The only exception will be Section 8.6, where the situation from the thermodynamic thought experiment will be used one more time.

8.1 Time evolution 

Postulate 4 from ra implies that time evolution is reversible and thus does not change the entropy. In particular, the second law of thermodynamics is valid in dynamics. However, it is clear that Postulate 4 is much stronger than what we actually need in order to ensure that time evolution does not violate the second law. As long as time evolution is described by a reversible transformation, the entropy is conserved and thus does not decrease.

However, Postulates 1+2 do not say anything about time evolution, thus we will need an extra postulate specifying time evolution (e.g. Postulate 4). As long as we only consider Postulates 1+2, we cannot check whether time evolution respects the second law, because time evolution itself remains undefined. For now, we set aside the question whether time evolution respects the second law; we consider the second law as a consistency requirement that any definition of time evolution should satisfy.


8.2 Issues concerning the second law in measurements and 
transformations 


In this section we will explain why some processes are able to decrease the entropy. This makes it necessary to focus on special classes of transformations and operations and to prove the entropy increase only for them.

Consider an operation $O = \{T_j, \dots\}$ with transformations $T_j$ and $\sum_j u_A \circ T_j = u_A$. Consider the action of the operation on an ensemble described by the state $w$. With probability $u_A \circ T_j(w)$ the state after the operation will be $\frac{T_j(w)}{u_A \circ T_j(w)}$. This induces a new ensemble:

$$ w' = \sum_{\{k \,|\, u_A \circ T_k(w) \neq 0\}} u_A \circ T_k(w) \cdot \frac{T_k(w)}{u_A \circ T_k(w)} \tag{8.1} $$

In general, we cannot assume that such operations only increase entropy. If one considers a system coupled to an environment and performs a transformation on the composite system, there is no reason why the entropy of the small system alone should not be allowed to decrease. So if we want our definition of transformations to be as general as possible, especially to cover transformations induced by larger systems, we have to accept that some of them decrease entropy.

This is already true in quantum theory, as the following example shows. Consider the following SWAP operation (the mathematical details are explained in Appendix C): Whenever an electron approaches a black-box device, that device absorbs the electron and instead emits a new electron in a known pure state. As every incoming electron is replaced by a new electron, this transformation is already properly normalized and its physical implementation is also clear. But as this transformation is able to convert a mixed state into a pure state, in general it will decrease entropy.
Projective measurements also play an important role for the second law in quantum theory. In Exercise 11.15 in [23], one has to show that a measurement described by $M_1 := |0\rangle\langle 0|$, $M_2 := |0\rangle\langle 1|$, with ensemble state $M_1 \rho M_1^\dagger + M_2 \rho M_2^\dagger$ after the measurement, decreases entropy. This is quite clear, because the state after this measurement is $|0\rangle\langle 0|$, i.e. a pure state.
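The entropy decrease in this example can be checked numerically. The following is a minimal sketch for a single qubit, assuming NumPy is available; the helper name `von_neumann_entropy` and the choice of the maximally mixed input state are ours and serve only as an illustration.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Return -tr(rho ln rho), ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])
M1 = ket0 @ ket0.T  # |0><0|
M2 = ket0 @ ket1.T  # |0><1|

# The matrices are real, so the transpose is the adjoint.
# Normalization of the measurement: M1^T M1 + M2^T M2 should be the identity.
completeness = M1.T @ M1 + M2.T @ M2

rho = np.eye(2) / 2  # maximally mixed input state (our choice)
rho_after = M1 @ rho @ M1.T + M2 @ rho @ M2.T

entropy_before = von_neumann_entropy(rho)
entropy_after = von_neumann_entropy(rho_after)
```

For this input the entropy drops from $\ln 2$ to zero, because both outcomes leave the system in $|0\rangle\langle 0|$.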

Because of all these conceptual problems, we decide to prove the second law only for projective measurements, in analogy to quantum theory. In contrast to quantum theory, however, we will not postulate that these are the fundamental measurements.


8.3 GPT observables 

The purpose of this section is to motivate why we will consider certain projective 
measurements for checking the second law. The basic idea is to consider projective 
measurements that correspond to measuring observables. 


We consider projective measurements in analogy to quantum theory. In quantum theory, observables are of the form $A = \sum_a a P_a$ with eigenvalues $a$ and projectors onto eigenspaces $P_a = \sum_j |j;a\rangle\langle j;a|$. The $|j;a\rangle\langle j;a|$ are pairwise orthogonal pure states. For our GPTs, we generalize observables as $A = \sum_a a \sum_j w_{j;a}$, where the $w_{j;a}$ are pairwise orthogonal pure states and form a maximal frame. For fixed $a$, the face $F_a$ generated by all the $w_{j;a}$ replaces the eigenspace, i.e. observables now have eigenfaces. For the corresponding symmetric projectors $P_a$ onto the linear spans of the faces $F_a$, we find $u_a := u_A \circ P_a = \sum_j w_{j;a}$ and thus $u_A \circ \sum_a P_a = \sum_a u_a = \sum_a \sum_j w_{j;a} = u_A$. As these projectors are positive, they are furthermore normalization-non-increasing because of $u_A \circ \sum_a P_a = u_A$. Thus the eigenprojectors form a valid operation. By Lemma 8.6 these projectors are mutually orthogonal. Thus $\langle P_j v, P_k w \rangle = \langle v, P_j P_k w \rangle = 0$ for $j \neq k$, which implies that the faces they project onto are also orthogonal. Thus the eigenfaces are orthogonal. The induced effects, i.e. the projective units, are $u_a = \sum_j w_{j;a}$. Such effects are also called sharp effects m-


Vice versa, consider an operation given by sharp effects $u_a = \sum_j w_{j;a}$ for pure states $w_{j;a}$. As $1 \ge u_a(w_{k;a}) = \sum_j \langle w_{j;a}, w_{k;a} \rangle = 1 + \sum_{j \neq k} \langle w_{j;a}, w_{k;a} \rangle$, for the same $a$ the $w_{j;a}$ are pairwise orthogonal. As $\sum_a u_a = u_A$ and $u_a(w_{j;a}) = 1$, we find $u_b(w_{j;a}) = 0$ for $b \neq a$ and thus $\langle w_{k;b}, w_{j;a} \rangle = \delta_{ab} \delta_{jk}$. Thus the $w_{j;a}$ can be used to define an observable $A = \sum_a a \sum_j w_{j;a}$ and thus give rise to a projective measurement. In that sense, there is a correspondence between sharp measurements and projective measurements of observables.

The important difference to quantum theory is that it is never specified what the pure states actually are. In QT, we know that the pure states are induced by a complex Hilbert space; we do not assume this for our GPTs, thus our treatment is more general.

We do not assume as a postulate that observables enter the GPT in this way. However, the natural generalization from quantum theory motivates analyzing the consequences of such an assumption in greater detail.

Non-degenerate observables $A = \sum_j a_j w_j$ with $a_j \in \mathbb{R}$, $a_j \neq a_k$ for $j \neq k$ and $\{w_j\}$ a maximal frame correspond to measurements with projective units $u_j = w_j$, i.e. faces generated by single pure states. We will call such projective measurements non-degenerate. Projective measurements corresponding to degenerate observables will be called degenerate.

We need to show that our notion of observables is well-defined, i.e. that the eigenvalues and eigenfaces do not depend on the choice of decomposition:

Theorem 8.1. Let $A = \sum_{x=1}^{n_a} a_x \sum_j w_{j,a,x}$ be arbitrary with $a_x \in \mathbb{R}$ pairwise unequal and $w_{j,a,x}$ a maximal frame. Assume $A = \sum_{x=1}^{n_b} b_x \sum_j w_{j,b,x}$ is a similar decomposition. Then $n_a = n_b$ and, up to permutation, $a_x = b_x$ and $\sum_j w_{j,a,x} = \sum_j w_{j,b,x}$.

Proof. Wlog assume that the $a_x$ and $b_x$ are ordered by size ($a_1 < a_2 < \dots$) and are numbered with $x = 1, 2, \dots$. Consider the smallest eigenvalues and assume $a_1 \neq b_1$, wlog $a_1 < b_1$. Then we use that maximal frames add up to the order unit:

$$ a_1 = \langle w_{k,a,1}, A \rangle = \Big\langle w_{k,a,1}, \sum_x b_x \sum_j w_{j,b,x} \Big\rangle \tag{8.2} $$
$$ = \sum_x b_x \Big\langle w_{k,a,1}, \sum_j w_{j,b,x} \Big\rangle > a_1 \Big\langle w_{k,a,1}, \sum_x \sum_j w_{j,b,x} \Big\rangle = a_1 \tag{8.3} $$

The $>$ is not a $\ge$, because all the $b_x > a_1$ and at least one of the $\langle w_{k,a,1}, \sum_j w_{j,b,x} \rangle > 0$ because of $u_A(w_{k,a,1}) = 1$ and $u_A = \sum_x \sum_j w_{j,b,x}$. But now we have reached a contradiction. Thus our assumption was false, and $a_1 = b_1$.

The $w_{j,a,1}$ generate a face $F_1$ and the $w_{j,b,1}$ generate a face⁵ $F_1'$. The projective order units are given by $u_1 = \sum_j w_{j,a,1}$ and $u_1' = \sum_j w_{j,b,1}$.

5 The $F_1'$ is used here simply as another face, and is not meant to imply that $F_1'$ is complementary or orthogonal to $F_1$.

$$ a_1 = \langle w_{k,a,1}, A \rangle = \Big\langle w_{k,a,1}, \sum_x b_x \sum_j w_{j,b,x} \Big\rangle = \sum_x b_x \Big\langle w_{k,a,1}, \sum_j w_{j,b,x} \Big\rangle \tag{8.4} $$
$$ = a_1 \Big\langle w_{k,a,1}, \sum_j w_{j,b,1} \Big\rangle + \sum_{x \ge 2} b_x \Big\langle w_{k,a,1}, \sum_j w_{j,b,x} \Big\rangle \tag{8.5} $$
$$ \ge a_1 \Big\langle w_{k,a,1}, \sum_j w_{j,b,1} \Big\rangle + a_1 \sum_{x \ge 2} \Big\langle w_{k,a,1}, \sum_j w_{j,b,x} \Big\rangle \tag{8.6} $$
$$ = a_1 \Big\langle w_{k,a,1}, \sum_x \sum_j w_{j,b,x} \Big\rangle = a_1 u_A(w_{k,a,1}) = a_1 \tag{8.7} $$

In the above equation, the strict $>$ holds if $\langle w_{k,a,1}, \sum_j w_{j,b,x} \rangle \neq 0$ for an $x \ge 2$. But then we would have a contradiction. Therefore $\langle w_{k,a,1}, \sum_j w_{j,b,x} \rangle = 0$ for all $x \ge 2$. Thus, because of normalization and maximal frames, $\langle w_{k,a,1}, \sum_j w_{j,b,1} \rangle = 1$, i.e. $u_1'(w_{k,a,1}) = 1$. In the same way one shows $u_1(w_{k,b,1}) = 1$.

By Proposition 5.29 from ra, $\Omega_A \cap F_1 = \{w \in \Omega_A \mid u_1(w) = 1\}$. (Besides, this shows that the symmetric projections onto linear spans of faces are neutral.) Thus the frame $w_{j,a,1}$ is found in $F_1'$ and the frame $w_{j,b,1}$ is found in $F_1$. As generating frames have the same size, we find $|F_1| \ge |F_1'|$ and $|F_1'| \ge |F_1|$. Thus $|F_1| = |F_1'|$. As all frames within a face with generating size generate the face, $F_1 = F_1'$. Thus also $u_1 = u_1'$, i.e. $\sum_j w_{j,a,1} = \sum_j w_{j,b,1}$.

We want to obtain an inductive proof.

The easiest way might be like this: As $\sum_j w_{j,a,1} = \sum_j w_{j,b,1}$, we modify the operator:

$$ A' := A + L \sum_j w_{j,a,1} = A + L \sum_j w_{j,b,1} \tag{8.8} $$
$$ = \sum_x (a_x + \delta_{x,1} \cdot L) \sum_j w_{j,a,x} = \sum_x (b_x + \delta_{x,1} \cdot L) \sum_j w_{j,b,x} \tag{8.9} $$

Here, $L$ is a very large number, such that now $a_2$ and $b_2$ are the smallest eigenvalues of $A'$. We rename the index: $a_1' := a_2$, $a_2' := a_3$, ..., and $a_{n_a}' := a_1 + L$ is the last because it is the largest eigenvalue (for $b_x'$ similarly). The $a_x'$ are still ordered by size. Now we repeat exactly the same argument as before to find $a_2 = a_1' = b_1' = b_2$ and $\sum_j w_{j,a,2} = \sum_j w_{j,b,2}$. We repeat this procedure until we are done. It is important to note that, as the $w_{j,a,x}$ and the $w_{j,b,x}$ form maximal frames and as we prove for each index $x$ that the corresponding frames have the same size, necessarily $n_a = n_b$, i.e. “we will not run out of b’s while we still have a’s left or vice versa”.


□ 

Corollary 8.2. The eigenvalues and eigenfaces of observables are well-defined: The eigenvalues are uniquely determined and $u_x := \sum_j w_{j,a,x} = \sum_j w_{j,b,x}$ generate the same eigenface $F_x$.

As generating frames of faces have a unique size, the statement also shows that the probability distribution of classical decompositions of states is uniquely determined. Also, for $w = \sum_x p_x \sum_j w_{jx} = \sum_x p_x \sum_j w'_{jx}$ classical decompositions of a state into maximal frames, we find $\sum_j w_{jx} = \sum_j w'_{jx}$. Thus $\log w := \sum_x \log(p_x) \sum_j w_{jx} = \sum_x \log(p_x) \sum_j w'_{jx}$ is independent of the choice of classical decomposition.

Likewise, for any function $f : \mathbb{R} \to \mathbb{R}$ and $A = \sum_x a_x \sum_j w_{j,a,x}$ as before, $f(A) := \sum_x f(a_x) \sum_j w_{j,a,x}$ is well-defined and independent of the decomposition into maximal frames.
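In the quantum special case, this independence can be illustrated numerically: two different orthonormal bases of a degenerate eigenspace give the same $f(A)$. The sketch below assumes NumPy and a real three-level system; the function name `apply_function`, the eigenvalues and the rotation angle are hypothetical choices made only for this illustration.

```python
import numpy as np

def apply_function(eigenvalues, frames, f):
    """Build f(A) = sum_x f(a_x) * sum_j w_{j,x} from one decomposition.

    frames[x] is a list of orthonormal vectors spanning the x-th eigenspace;
    each vector v stands for the pure state |v><v|.
    """
    dim = len(frames[0][0])
    result = np.zeros((dim, dim))
    for a_x, frame in zip(eigenvalues, frames):
        for vec in frame:
            result += f(a_x) * np.outer(vec, vec)
    return result

e0, e1, e2 = np.eye(3)
eigenvalues = [2.0, 5.0]  # eigenvalue 2 is twofold degenerate

# First decomposition: standard basis of the degenerate eigenspace.
frames_a = [[e0, e1], [e2]]

# Second decomposition: rotated basis of the same eigenspace.
theta = 0.7
r0 = np.cos(theta) * e0 + np.sin(theta) * e1
r1 = -np.sin(theta) * e0 + np.cos(theta) * e1
frames_b = [[r0, r1], [e2]]

log_a = apply_function(eigenvalues, frames_a, np.log)
log_b = apply_function(eigenvalues, frames_b, np.log)
```

Both decompositions yield the same operator, because only the sum of the frame elements within each eigenspace enters.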

It is important to note that the close similarity to quantum observables is caused by Postulates 1 and 2, especially the fact that eigenvalues and eigenfaces are well-defined.

By Lemma 5.46 from [L2f every element $w$ of $A$ has a generalized classical decomposition of the form $w = \sum_j p_j w_j$ with $\{w_j\}$ a frame and $p_j \in \mathbb{R}$. Thus every element of $A$ can be interpreted as an observable. This fact is in direct analogy to quantum theory, where every hermitian operator is interpreted as an observable.

Sometimes [33], observables are introduced to GPTs in a different but equivalent way:

When measuring an observable on ensembles, one obtains an average value which agrees with the expectation value in the thermodynamic limit. Thus it makes sense to introduce observables as functions $A_+ \to \mathbb{R}$. With the same argument as for effects, these functions should be convex-linear and can be extended to linear functions $A \to \mathbb{R}$. Therefore observables are sometimes defined as elements of $A^*$.

By self-duality, for every observable $A^* \in A^*$ there exists an $A \in A$ with $A^* = \langle A, \cdot \rangle$. Vice versa, for every $A \in A$ we find that $A^* := \langle A, \cdot \rangle$ is an element of $A^*$. Thus the elements of $A^*$ and $A$ are in bijective correspondence, and the definitions of observables as elements of $A$ or as elements of $A^*$ are equivalent.

When postulating the standard axioms of quantum theory, there is always a mysterious quantum-classical transition in the measurement in the following sense: The measurement device, possibly quantum itself, measures a quantum property and returns a classical result (a digital number on a screen). In our framework, considering the motivation for Postulate 1, we can justify this as follows: Every observable is part of a classical subspace. The measurement of the observable perfectly distinguishes the eigenspaces of the observable and is a consistent generalisation of the classical measurement to other classical subspaces. Thus the quantum optics explanation (see e.g. [02] or [033) by decoherence of an open quantum system (the measurement device being the environment) reduces the original state to a state of the classical subspace of the observable, leading to a completely classical behaviour of the observable and the state when only actions on this classical subspace are performed.

8.4 Second law for non-degenerate projective measurements

First, we show the second law for non-degenerate projective measurements:

Let $w \in \Omega_A$ be an arbitrary state, $w = \sum_j p_j w_j$ a classical decomposition into a maximal frame. Let $P_j$ be positive symmetric projectors with effects $u_j = u_A \circ P_j$ which form a properly normalized measurement, i.e. the $P_j$ form an operation. Furthermore, assume the $P_j$ are the projectors onto the span of $w'_j$, where the $w'_j$ form a maximal frame (this means the projectors correspond to rank-1 projectors from quantum theory, i.e. no degeneracy of the measured observable). The measurement is conducted on all systems of the ensemble, i.e. the ensemble state after the measurement is given by

$$ w' = \sum_{\{j \,|\, u_A \circ P_j(w) \neq 0\}} u_A \circ P_j(w) \cdot \frac{P_j(w)}{u_A \circ P_j(w)} \tag{8.10} $$


Theorem 8.3. Suppose Classical Decomposability and Strong Symmetry are satisfied. Then non-degenerate projective measurements as defined above never decrease the entropy of an ensemble:

$$ S(w') \ge S(w) \tag{8.11} $$

Proof. As described above, let $w = \sum_j p_j w_j$ be a classical decomposition into a maximal frame. Furthermore $P_j$ is positive and projects onto the linear span of $w'_j$, i.e. $\mathrm{im}_+(P_j) = \mathbb{R}_{\ge 0} \cdot w'_j$ (where $\mathrm{im}_+(P_j) := \mathrm{im}(P_j) \cap A_+$). Via the projective units we see that $\langle w'_j, \cdot \rangle = u_A \circ P_j$.

Thus $w' = \sum_j P_j(w) =: \sum_j q_j w'_j$ where $q_j = u_A \circ P_j(w)$:

If $q_j = u_A \circ P_j(w) = 0$, then $P_j w = 0$. Otherwise, $\frac{P_j(w)}{u_A \circ P_j(w)} = w'_j$ by proper normalization and therefore $P_j(w) = u_A \circ P_j(w) \cdot w'_j$. Furthermore: $q_j = u_A \circ P_j(w) = \sum_k u_A \circ P_j(w_k) \, p_k =: \sum_k M_{jk} p_k$. As maximal frames are considered, $M_{jk} = u_A \circ P_j(w_k)$ is a square matrix with non-negative entries. Also:

$\sum_j M_{jk} = \sum_j \langle w'_j, w_k \rangle = u_A(w_k) = 1 = u_A(w'_j) = \sum_k \langle w'_j, w_k \rangle = \sum_k M_{jk}$. Thus $M_{jk}$ is doubly stochastic, and in analogy to [21] we find that entropy increases:

$S(w) = S(\sum_j p_j w_j) = H(p)$ where $H(p) = -\sum_j p_j \ln p_j$ is the Shannon entropy⁶. By Birkhoff’s theorem, $M = \sum_{\sigma \in S_N} a_\sigma \, \sigma$ is a convex combination of permutations. By Schur concavity of the Shannon entropy, $H(q) \ge \sum_{\sigma \in S_N} a_\sigma H(\sigma(p)) = H(p)$. $S(w') = H(q)$ because the $w'_j$ form a frame. Thus $S(w') \ge S(w)$. □
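The core of this proof is a statement about doubly stochastic matrices, which can be illustrated numerically. The sketch below assumes NumPy and builds a doubly stochastic matrix from the squared overlaps of two orthonormal bases, as happens in the quantum special case; the dimension, the random seed and the probability vector are arbitrary choices of ours.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy with natural logarithm; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log(p)))

rng = np.random.default_rng(0)

# Columns of Q form a random orthonormal basis, playing the role of the
# frame w'_j; the standard basis plays the role of the frame w_k.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))

# M[j, k] = <w'_j, w_k>, here the squared overlap of the basis vectors.
M = Q.T ** 2

p = np.array([0.6, 0.3, 0.1, 0.0])  # spectrum of the state before measuring
q = M @ p                            # outcome probabilities

entropy_before = shannon_entropy(p)
entropy_after = shannon_entropy(q)
```

Rows and columns of `M` each sum to one, and `entropy_after` is never smaller than `entropy_before`, in line with Theorem 8.3.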


8.5 Second law for degenerate projective measurements 

Next we wish to consider also degenerate measurements, i.e. measurements that correspond to degenerate observables in quantum theory. To do so, we will adapt a proof for the quantum case from [23] using the relative entropy; in particular, we will adapt the proof from [23] that the relative entropy is non-negative. However, as our projective measurements are not necessarily induced by projectors on an underlying pure state Hilbert space, we have to find a different way than in [23] to take advantage of the fact that the relative entropy is non-negative.

6 In this proof, the Shannon entropy is defined with $\ln$ instead of $\log_2$. Furthermore, we set $k_B = 1$ to simplify notation.

Definition 8.4. Assume strong symmetry and classical decomposability are fulfilled. Then the relative entropy is defined as:

$$ S(w \| v) := -S(w) - \langle w, \ln v \rangle \tag{8.12} $$

Here, for $v = \sum_j q_j v_j$ a classical decomposition into a maximal frame, $\ln v := \sum_j \ln(q_j) \, v_j$.

In quantum theory, our definition reproduces the standard definition of relative entropy from [23]. Like there, we will show that the relative entropy is never negative:

Theorem 8.5 (Klein’s inequality). For all $v, w \in \Omega_A$:

$$ S(w \| v) \ge 0 \tag{8.13} $$

Proof. Consider classical decompositions $w = \sum_j p_j w_j$ and $v = \sum_k q_k v_k$ into maximal frames. Then:

$$ S(w \| v) = \sum_j p_j \ln p_j - \sum_{j,k} p_j \ln q_k \, \langle w_j, v_k \rangle =: \sum_j p_j \Big( \ln p_j - \sum_k P_{jk} \ln q_k \Big) \tag{8.14} $$

Here, $P_{jk} := \langle w_j, v_k \rangle \ge 0$ and $\sum_j P_{jk} = \sum_k P_{jk} = 1$ because maximal frames add up to the order unit. Define $r_j := \sum_k P_{jk} q_k = \langle w_j, v \rangle$. $\ln$ is strictly concave and thus $\sum_k P_{jk} \ln q_k \le \ln r_j$. Thus:

$$ S(w \| v) \ge \sum_j p_j \ln \frac{p_j}{r_j} \tag{8.15} $$

As $r_j = \sum_k P_{jk} q_k \ge 0$ and $\sum_j r_j = \sum_k \sum_j P_{jk} q_k = \sum_k q_k = 1$, the $r_j$ also form a probability distribution. By positivity of the classical relative entropy (see [23], Theorem 11.1), we thus find

$$ S(w \| v) \ge 0 \tag{8.16} $$

Alternative proof from Peres’ book [21]:

$$ S(w \| v) = \sum_j p_j \Big( \ln p_j - \sum_k P_{jk} \ln q_k \Big) = \sum_{jk} p_j P_{jk} \ln \Big( \frac{p_j}{q_k} \Big) \tag{8.17} $$

We use $\ln(x) \ge 1 - \frac{1}{x}$ with equality exactly for $x = 1$:

$$ S(w \| v) \ge \sum_{jk} p_j P_{jk} \Big( 1 - \frac{q_k}{p_j} \Big) = \sum_j p_j - \sum_j r_j = 0 \tag{8.18} $$

□
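Klein's inequality in the form (8.14) to (8.16) only needs the two spectra and the doubly stochastic overlap matrix $P_{jk}$, so it is easy to test numerically. The following sketch assumes NumPy and strictly positive spectra; the overlap matrix is generated from a random orthogonal matrix, which is a hypothetical choice for illustration.

```python
import numpy as np

def relative_entropy(p, q, P):
    """S(w||v) = sum_j p_j ln p_j - sum_{j,k} p_j P_jk ln q_k, cf. (8.14).

    p, q: strictly positive spectra of w and v.
    P[j, k]: overlap <w_j, v_k> of the two maximal frames.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p)) - p @ P @ np.log(q))

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
P = Q ** 2  # squared entries of an orthogonal matrix: doubly stochastic

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])

r = P @ q                                       # r_j = sum_k P_jk q_k
lower_bound = float(np.sum(p * np.log(p / r)))  # right-hand side of (8.15)
s_rel = relative_entropy(p, q, P)
s_same = relative_entropy(p, p, np.eye(3))      # S(w||w)
```

Here `s_rel` is bounded below by `lower_bound`, which is itself a non-negative classical relative entropy, and `s_same` vanishes.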


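Lemma 8.6. Assume strong symmetry and classical decomposability are fulfilled. Let $P_j$ be symmetric, positive projectors which form an operation. Then $P_k P_j = \delta_{jk} P_j$, i.e. the $P_j$ are mutually orthogonal.

Proof. If $P_j w = 0$, then trivially $P_k P_j w = 0$. If $P_j w \neq 0$, then

$$ 1 = u_A \Big( \frac{P_j w}{(u_A \circ P_j)(w)} \Big) = \sum_k (u_A \circ P_k) \Big( \frac{P_j w}{(u_A \circ P_j)(w)} \Big) \tag{8.19} $$
$$ = 1 + \sum_{k \neq j} (u_A \circ P_k) \Big( \frac{P_j w}{(u_A \circ P_j)(w)} \Big) \tag{8.20} $$

Thus $\sum_{k \neq j} (u_A \circ P_k) \big( \frac{P_j w}{(u_A \circ P_j)(w)} \big) = 0$. By positivity of $P_k$, $(u_A \circ P_k) \big( \frac{P_j w}{(u_A \circ P_j)(w)} \big) = 0$ for $k \neq j$. As 0 is the only element of $A_+$ on which $u_A$ vanishes, $P_k P_j w = 0$. As the cone is generating ($\mathrm{Span}(A_+) = A$), $P_k P_j = 0$, in particular $\langle P_k w, P_j w \rangle = 0$. □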


Lemma 8.7. Let $P$ be a positive, symmetric, normalization-non-increasing projector which projects onto the linear span of a face $F$. For states $w$, $Pw$ is always found in the face $F$.

Proof. Let $w$ be a state. If $Pw = 0$, then trivially $Pw \in F$. If $Pw \neq 0$: By surjectivity onto $\mathrm{Span}(F)$, $\frac{Pw}{u_A(Pw)} = \sum_j p_j w_j$ for $p_j \in \mathbb{R}$ and $w_j \in F$. We assume $p_j \neq 0$ and $w_j \neq 0$ (otherwise they do not contribute) and normalize: $w'_j := \frac{w_j}{u_A(w_j)}$, $p'_j := u_A(w_j) \, p_j$. $w'_j$ is still found in $F$, because $w_j = \frac{1}{\lambda}(\lambda w_j) + (1 - \frac{1}{\lambda}) \cdot 0$ for $\lambda > 1$ and $\lambda w_j = \lambda w_j + (1 - \lambda) \cdot 0$ for $0 < \lambda < 1$, together with $F$ being a face of $A_+$, imply that $\mathbb{R}_{\ge 0} \cdot w_j \subset F$. Now we find $\frac{Pw}{u_A(Pw)} = \sum_j p'_j w'_j$. As states are normalized, $\sum_j p'_j = 1$. Thus, the state $\frac{Pw}{u_A(Pw)}$ is an affine combination of states in $F$. By Proposition 2.10 from [15], $F = \mathrm{aff}(F) \cap A_+$. Thus $Pw \in F$. □

Lemma 8.8. Assume Postulates 1 and 2. Let $F \neq \{0\}$ be a face of $A_+$ and $w \in F$. Then there exists a classical decomposition $w = \sum_j p_j w_j$ which only uses states in $F$, i.e. $w_j \in F \ \forall j$.

Proof. Let $w = \sum_j p_j w_j$ be a classical decomposition. Wlog $p_j > 0 \ \forall j$. As $F$ is a face, we find $w_j \in F \ \forall j$. □

Now we finally consider the entropy in degenerate projective measurements.

We recall: We consider observables $A = \sum_a a \sum_j w_{j;a}$. Here $a \in \mathbb{R}$ are the eigenvalues and the $w_{j;a}$ form a maximal frame. The $u_a := \sum_j w_{j;a}$ are the projective units of the eigenfaces $F_a$ with positive symmetric projector $P_a$.

We know that the projective units are valid effects with $u_a = u_A \circ P_a$. Furthermore, $\sum_a u_a = \sum_{a,j} w_{j;a} = u_A$ by having a maximal frame.

Vice versa, consider an operation given by symmetric projectors $P_a$ which project onto the linear spans of faces $F_a$. We know by Proposition 4.12 that these projectors are positive. By Lemma 8.6 the $P_a$ are mutually orthogonal and thus also the faces $F_a$ are orthogonal. For all $a$, let $w_{j;a}$ be frames that generate $F_a$. Then $u_a := u_A \circ P_a$ is the corresponding projective unit and therefore a valid effect. As the $P_a$ form an operation, $\sum_{a,j} w_{j;a} = \sum_a u_a = \sum_a u_A \circ P_a = u_A$, i.e. the $w_{j;a}$ form a maximal frame in total. Thus we can define an observable $A = \sum_a a \, u_a = \sum_a a \sum_j w_{j;a}$.


Theorem 8.9. Suppose strong symmetry and classical decomposability are satisfied. Let $P_a$ be symmetric projectors which form a valid operation and project onto the linear spans of faces $F_a$. Then the induced measurement with post-measurement ensemble state $w' = \sum_a P_a w$ does not decrease entropy: $S(w') \ge S(w)$.


Proof. Let $u_a := u_A \circ P_a$ be the projective order units and $w_{j;a}$ frames that generate the $F_a$. We have already argued that the projectors and the faces are mutually orthogonal. We have also seen that the $w_{j;a}$ form a maximal frame in total.

We consider $-S(w) - \langle w, \ln w' \rangle = S(w \| w') \ge 0$. Like in [23], Theorem 11.9, we claim $\langle w, \ln w' \rangle = -S(w')$. If that claim is true, then $S(w') \ge S(w)$. Thus we only have to prove this claim, but as our theories cannot be assumed to have an underlying pure state Hilbert space, we will use a different proof.


As the $P_a$ are mutually orthogonal, so are the $P_a w$. Because of Lemma 8.7, $P_a w \in F_a$. If $P_a w = 0$ we use the decomposition $P_a w = \sum_j u_A(P_a w) \cdot r_{aj} \cdot w_{aj}$ with $w_{aj} := w_{j;a}$ and $r_{aj} := \delta_{1j}$. If $P_a w \neq 0$, we perform a classical decomposition $\frac{P_a w}{u_A(P_a w)} = \sum_k r_{ak} w_{ak}$ for $r_{ak} > 0$. Because of the classical decomposition, the $w_{ak}$ are mutually orthogonal for the same $a$. The $w_{ak}$ are found in the face $F_a$ corresponding to the projectors $P_a$ (Lemma 8.8). Therefore, we add terms $r_{aj} w_{aj} = 0 \cdot w_{aj}$ to the classical decomposition of $\frac{P_a w}{u_A(P_a w)}$ to complete the $w_{aj}$ to a generating frame of $F_a$. As all generating frames of a face have the same size, the new generating frames also add up to the order unit in total, i.e. they can be combined to a maximal frame. Thus $P_a w_{ak} = w_{ak}$ and, as these faces are mutually orthogonal, the $w_{ak}$ are mutually orthogonal also for different $a$. By orthogonality, $P_b w_{ak} = P_b P_a w_{ak} = 0$ for $b \neq a$.

In total, we have found a classical decomposition $w' = \sum_{aj} u_A(P_a w) \cdot r_{aj} \cdot w_{aj}$ with $P_a w_{bj} = \delta_{ab} w_{bj}$. With $\ln(w') = \sum_{aj} \ln(u_A(P_a w) \cdot r_{aj}) \, w_{aj}$ we find:


$$ \sum_a P_a \ln(w') = \sum_{abj} \ln(u_A(P_b w) \cdot r_{bj}) \, P_a w_{bj} = \sum_{aj} \ln(u_A(P_a w) \cdot r_{aj}) \, w_{aj} = \ln(w') \tag{8.21} $$


Finally, using symmetry of the projectors and

$$ S(w') = -\sum_{aj} u_A(P_a w) \cdot r_{aj} \ln \big( u_A(P_a w) \cdot r_{aj} \big) = -\langle w', \ln w' \rangle \tag{8.22} $$

for the same classical decomposition:

$$ \langle w, \ln w' \rangle = \Big\langle w, \sum_a P_a \ln w' \Big\rangle = \Big\langle \sum_a P_a w, \ln w' \Big\rangle = \langle w', \ln w' \rangle = -S(w') \tag{8.23} $$

□
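In quantum theory, the projectors of this theorem act as $\rho \mapsto P_a \rho P_a$, and both the claim $\langle w, \ln w' \rangle = -S(w')$ and the resulting entropy increase can be checked numerically. The sketch below assumes NumPy and a real three-level system with one twofold degenerate eigenvalue; the random input state and the helper names are our own illustrative choices.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Return -tr(rho ln rho), ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

def matrix_log(rho):
    """Matrix logarithm of a positive definite symmetric matrix."""
    evals, vecs = np.linalg.eigh(rho)
    return vecs @ np.diag(np.log(evals)) @ vecs.T

rng = np.random.default_rng(2)
G = rng.normal(size=(3, 3))
rho = G @ G.T
rho /= np.trace(rho)  # random full-rank density matrix

P_a = np.diag([1.0, 1.0, 0.0])  # projector onto a twofold degenerate eigenspace
P_b = np.diag([0.0, 0.0, 1.0])  # projector onto the remaining eigenspace

# Post-measurement ensemble state w' = sum_a P_a w
rho_after = P_a @ rho @ P_a + P_b @ rho @ P_b

s_before = von_neumann_entropy(rho)
s_after = von_neumann_entropy(rho_after)

# Left-hand side of the claim <w, ln w'> = -S(w')
claim_lhs = float(np.trace(rho @ matrix_log(rho_after)))
```

The value `claim_lhs` agrees with `-s_after` up to rounding, and `s_after` is not smaller than `s_before`.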

Comment. If you read through the proofs carefully, you will see that the proofs 
would also work if log(w) depended on the choice of classical decomposition. 


8.6 Mixing processes 


Now we consider the entropy for a mixing process, i.e. what happens if we mix some gases which contain GPT-ensembles, see also Figure 8.1. For the last time in this thesis, we use the convention that $S$ is the entropy proportional to the number of systems, while $s = \frac{S}{N}$ is the entropy divided by the number of systems, and we explicitly list $k_B$.


Like in von Neumann’s thought experiment, we consider boxes, each of them filled with a GPT system, forming an ideal gas. We consider $n$ tanks at temperature $T$. In the $j$-th tank one finds $N_j$ boxes and each of these boxes contains the state $w_j$. Furthermore, the $j$-th tank is assumed to have volume $V_j := \frac{N_j}{N} V$, where $N := \sum_{j=1}^{n} N_j$ is the total number of boxes/systems. Note that the gases in the tanks all have the same density. As the tanks are isolated from each other, the total GPT entropy is given by $S_{before} = \sum_j S_j(w_j)$ where $S_j$ is the GPT entropy of the gas in the $j$-th tank: For a classical decomposition $w_j = \sum_k p_k^{(j)} w_k^{(j)}$ we thus have $S_j(w_j) = -N_j k_B \sum_k p_k^{(j)} \ln p_k^{(j)}$. We use the normalized entropy $s$ which is already divided by the number of particles, i.e. $s(v) = -k_B \sum_k q_k \ln q_k$ for a state $v$ with classical decomposition $v = \sum_k q_k v_k$. We write $S_{before} = \sum_j N_j s(w_j)$. The tanks are merged into a giant tank. The walls separating the tanks are removed such that the gases mix. Now we can put the walls back in, a process that is now reversible as the gases are already mixed.

The new ensemble state is given by $w' := \sum_j \frac{N_j}{N} w_j$ because with probability $\frac{N_j}{N}$ a random box belonged to the $w_j$-gas before. The total GPT entropy after this mixing process is given by $S_{after} = S(w') = N s(w')$.

We see that the tanks which originally contained the gases $w_j$ now contain the mixed gas $w'$ at the same conditions $T, N_j, V_j$. Thus the only difference in entropy is caused by the GPT-systems. Therefore, we need to check

$$ S_{before} = \sum_j N_j s(w_j) \le S_{after} = N s(w') \tag{8.24} $$

which is equivalent to:

$$ \sum_j \frac{N_j}{N} s(w_j) \le s\Big( \sum_j \frac{N_j}{N} w_j \Big) \tag{8.25} $$

[Figure 8.1, schematic: two tanks with gases $w_1$ ($N_1$ boxes, volume $V_1 = \frac{N_1}{N} V$) and $w_2$ ($N_2$ boxes, volume $V_2 = \frac{N_2}{N} V$), $N = N_1 + N_2$. Step "remove wall": mixing, irreversible, giving $w' = \frac{N_1}{N} w_1 + \frac{N_2}{N} w_2$. Step "insert/remove wall": reversible, already mixed. Question: $\Delta S \ge 0$?]

Figure 8.1: At first, different pure gases are contained in tanks of the same density. Then the gases are mixed by removing the walls. Afterwards the walls can be put back in, giving the tanks from before but now with mixed gases in them. Does the entropy increase in the irreversible mixing procedure?


By continuity, we thus have to check for all normalized states $w_j \in \Omega_A$ and probability distributions $p_j$ that $s(\sum_j p_j w_j) \ge \sum_j p_j s(w_j)$. Therefore we have argued, using a thermodynamic argument, that the entropy must be concave as a function.

As we now return to the mathematical properties of our entropy, we also return to the convention that $S(\sum_j p_j w_j) = -\sum_j p_j \ln p_j$ for a classical decomposition $\sum_j p_j w_j$, i.e. we set $k_B = 1$ and divide by the number of boxes. Now we show that the entropy is concave:


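Theorem 8.10. Assume Postulates 1 and 2.

The entropy is concave: Let $w_1, \dots, w_n \in \Omega_A$ and $p_1, \dots, p_n$ a probability distribution. Then:

$$ S\Big( \sum_j p_j w_j \Big) \ge \sum_j p_j S(w_j) \tag{8.26} $$

Proof.

$$ 0 \le \sum_j p_j S\Big( w_j \Big\| \sum_k p_k w_k \Big) = -\sum_j p_j S(w_j) - \sum_j p_j \Big\langle w_j, \ln \sum_k p_k w_k \Big\rangle $$
$$ = -\sum_j p_j S(w_j) - \Big\langle \sum_j p_j w_j, \ln \sum_k p_k w_k \Big\rangle = -\sum_j p_j S(w_j) + S\Big( \sum_j p_j w_j \Big) $$

□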
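The mixing inequality (8.25), and with it the concavity statement, can be illustrated for quantum ensembles. The following sketch assumes NumPy, sets $k_B = 1$, and uses hypothetical box numbers $N_j$ and random real density matrices as the tank states.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Entropy per system, -tr(rho ln rho), with k_B = 1."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

def random_state(rng, dim):
    """Random full-rank real density matrix."""
    G = rng.normal(size=(dim, dim))
    rho = G @ G.T
    return rho / np.trace(rho)

rng = np.random.default_rng(3)
states = [random_state(rng, 3) for _ in range(4)]  # state w_j in the j-th tank
N_j = np.array([10, 20, 30, 40])                   # boxes per tank (hypothetical)
weights = N_j / N_j.sum()                          # N_j / N

mixed = sum(p * rho for p, rho in zip(weights, states))  # w' after mixing

# Entropy per system before and after removing the walls
s_before = sum(p * von_neumann_entropy(rho) for p, rho in zip(weights, states))
s_after = von_neumann_entropy(mixed)
```

Multiplying both values by $N$ gives $S_{before}$ and $S_{after}$ from (8.24); `s_after` is never smaller than `s_before`.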
9 Information-theoretic and operational
considerations about the entropy


9.1 Motivation, definitions and conventions 


So far, all our considerations about the entropy have been from a thermodynamic point of view. However, as the GPT framework has a very close connection to quantum information theory and its operational way of thinking, it is not surprising that there has already been some work on the entropy using an operational/information-theoretic approach. In p0]J and [22] (see also [01] and [32]), two different entropies were defined for very general GPTs. One of them considers measurements, the other the construction of states. In classical and quantum theory, both agree with the regular Shannon/von Neumann entropy. But in general GPTs, this is not necessarily the case. The purpose of this section is to analyse these two entropies in the context of the strong structure provided by our postulates. One of the main results, Theorem 9.4, namely that the measurement entropy coincides with our spectral definition of the entropy, was found in a collaboration between our group and Howard Barnum⁷. The proof is a generalization of the proof for quantum theory found in [23], Lemma B.1.


We will adapt the conventions from [24] and introduce some basic definitions about fine-graining of measurements, measurement entropy and decomposition entropy used there. Afterwards, we derive our results.

As this chapter considers the entropy from an information-theoretic perspective, all entropies in this chapter are defined with $\log := \log_2$ instead of $\ln$ and we omit $k_B$.


Let $e_1, \dots, e_n$ and $f_1, \dots, f_m$ be two normalized measurements such that there exists a map $M : \{1, \dots, n\} \to \{1, \dots, m\}$ with

$$ \sum_{\{j \,|\, M(j) = k\}} e_j = f_k \quad \forall k \in \{1, \dots, m\} \tag{9.1} $$

If $M$ is bijective, then the measurement $f$ is simply a re-labelling of $e$. If there exists a $k$ with $M(j) \neq k \ \forall j$, then because of the normalization of the $e$-measurement, $f_k = 0$, i.e. it is a trivial outcome that never happens. If $M$ is not injective, then $f$ is a coarse-graining of $e$ (or vice versa, $e$ a refinement of $f$) in the sense that $f$ is obtained from $e$ by collecting several outcomes of $e$ and giving them a common, new outcome-label (and maybe adding the 0-effect a few times). In that sense, we do not care about which of the $e_j$ triggered the new effect. An example is shown in Figure 9.1.

7 Originally, we only considered the generalization of the Shannon entropy. But the other Rényi entropies use the exact same proof.

Figure 9.1: This figure shows an example for measurements $e, f$ such that $e$ is a refinement of $f$. Here, the outcomes 1, 3, 5 of $e$ each trigger outcome 1 of $f$, while the outcomes 2, 4 of $e$ trigger outcome 2 of $f$. Thus an $f$-measurement can be constructed from an $e$-measurement by checking if we have one of the outcomes 1, 3, 5 or one of the outcomes 2, 4.

However, there exist trivial refinements/coarse-grainings: for those, $e_j \propto f_{M(j)} \ \forall j$. We write $e_j = p_j f_{M(j)}$. Then such a measurement can be obtained by performing $f$, and if outcome $k$ is triggered, we activate a classical random number generator which generates the final outcome $j$ among the $j$ with $M(j) = k$ with probability

$$ \frac{p_j}{\sum_{\{a \,|\, M(a) = k\}} p_a} \tag{9.2} $$


Thus a trivial refinement does not yield any additional information about the GPT-system. We only get additional information about the classical random number generator used at the end. So trivial refinements have no additional advantage in analyzing GPT-systems. And as we want to quantify the information of GPT-sources or the information missing about a GPT-system, we are not interested in any classical random number generator used at the end. An example is shown in Figure 9.2.
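The extra randomness introduced by a trivial refinement can be made explicit with a small numerical example. The sketch below assumes NumPy; the outcome probabilities $f_k(w)$, the map $M$ and the coefficients $p_j$ are hypothetical values chosen only for illustration.

```python
import numpy as np

def shannon_entropy_bits(p):
    """Shannon entropy in bits; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

f_probs = np.array([0.7, 0.3])  # f_k(w) for some state w (hypothetical)
M = np.array([0, 0, 1])         # outcome j of e is mapped to outcome M[j] of f
p = np.array([0.5, 0.5, 1.0])   # e_j = p_j * f_{M(j)}; sums to 1 per outcome of f

e_probs = p * f_probs[M]        # outcome probabilities e_j(w)

H_f = shannon_entropy_bits(f_probs)
H_e = shannon_entropy_bits(e_probs)

# Entropy of the classical random number generator, averaged over outcomes of f
extra_bits = 0.7 * shannon_entropy_bits([0.5, 0.5]) + 0.3 * shannon_entropy_bits([1.0])
```

The entropy of the refined measurement exceeds that of $f$ exactly by `extra_bits`, i.e. by the entropy of the random number generator, which carries no information about the GPT-system.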


We call a measurement fine-grained if it does not have any non-trivial refinements. We call $\mathcal{E}^*$ the set of fine-grained measurements.




Figure 9.2: This figure shows an example for measurements e,f such that e is a trivial 
refinement of f. Here, e\ = e§ = ^f\, e 2 = e^ = ^fa and = fa. A 
e-measurement can be constructed by using an f-measurement and by using a classical random number generator afterwards, here a die. Thus the fine-graining from f to e does not yield any additional information about the GPT-system.
Now we consider the Rényi entropies [28], which are often used in information theory: For a probability distribution $p = (p_1, p_2, \dots)$ the Rényi entropies are defined as:

$$ H_\alpha(p) = \frac{1}{1 - \alpha} \log \Big( \sum_j p_j^\alpha \Big) \tag{9.3} $$

where $\alpha \in \, ]0, \infty[$, $\alpha \neq 1$. Furthermore,

$$ H_0(p) := \lim_{\alpha \to 0} H_\alpha(p) = \log |\mathrm{supp}(p)| \tag{9.4} $$

with $\mathrm{supp}(p) = \{ j \mid p_j > 0 \}$ is called the max-entropy and

$$ H_\infty(p) := \lim_{\alpha \to \infty} H_\alpha(p) = -\log \max_j p_j \tag{9.5} $$

is called the min-entropy. Also,

$$ H_1(p) := \lim_{\alpha \to 1} H_\alpha(p) = -\sum_j p_j \log p_j = H(p) \tag{9.6} $$

is just the regular Shannon entropy $H$.

For $\alpha \in [0, \infty]$ and GPTs satisfying Postulates 1 and 2, we generalize the classical Rényi entropies:

$$ H_\alpha(w) = H_\alpha(p) \tag{9.7} $$

where $w = \sum_j p_j w_j$ is any classical decomposition.
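The classical Rényi entropies (9.3) to (9.6), applied to the spectrum $p$ of a classical decomposition, can be sketched as follows. The code assumes NumPy; the function name `renyi_entropy` and the example distribution are our own choices, and the limiting cases are implemented through their closed forms rather than numerical limits.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Order-alpha Renyi entropy in bits, including alpha = 0, 1 and infinity."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # only the support contributes
    if alpha == 0:
        return float(np.log2(len(p)))          # max-entropy (9.4)
    if alpha == 1:
        return float(-np.sum(p * np.log2(p)))  # Shannon entropy (9.6)
    if np.isinf(alpha):
        return float(-np.log2(p.max()))        # min-entropy (9.5)
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))  # (9.3)

p = np.array([0.5, 0.25, 0.25])  # example spectrum
H0 = renyi_entropy(p, 0)
H1 = renyi_entropy(p, 1)
H2 = renyi_entropy(p, 2)
Hinf = renyi_entropy(p, np.inf)
H_near_1 = renyi_entropy(p, 1 + 1e-6)  # close to the Shannon value by continuity
```

For this example the entropies decrease with increasing order, from the max-entropy $\log 3$ down to the min-entropy of one bit.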

Following [23], we introduce the Rényi measurement entropies and Rényi decomposition
entropies, which can also be used in GPTs which do not satisfy our postulates:
For $\alpha \in [0, \infty]$, we define the order-$\alpha$ Rényi measurement entropy to be:

$$\widehat{H}_\alpha(w) = \inf_{e \in \mathcal{E}^*} H_\alpha(e_1(w), e_2(w), \ldots) \tag{9.8}$$

where $H_\alpha$ on the right hand side denotes the classical Rényi entropy and $\mathcal{E}^*$ is the
set of fine-grained measurements. One uses fine-grained measurements because they yield
the most information. Taking the infimum has two advantages: First of all, we eliminate
the useless classical information caused by trivial refinements. Secondly, a (fine-grained)
measurement with minimal entropy can be used to characterize a system; for example in
quantum theory, particles prepared in a state $|\psi\rangle$ which all give the same energy in
energy measurements would be said to be in an energy eigenstate. If instead we performed
a position measurement, we would have a higher entropy, but the result of such a
measurement is not a good characterization of the state (before the measurement).
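
The first advantage of the infimum can be made concrete with a small sketch. It assumes hypothetical outcome probabilities $f_j(w)$ and a fair coin as the classical random number generator; neither number is taken from the text. A trivial refinement multiplies every effect by a classical probability, so it only adds the entropy of the coin.

```python
import math

def shannon(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

f_probs = [0.7, 0.2, 0.1]   # hypothetical outcome probabilities f_j(w)
coin = [0.5, 0.5]           # classical randomness used for the trivial refinement

# Trivial refinement: every refined effect is proportional to one f_j.
e_probs = [c * fj for fj in f_probs for c in coin]

# Entropy gained by the refinement; it comes from the coin only.
extra_bits = shannon(e_probs) - shannon(f_probs)
```

In this sketch `extra_bits` equals the entropy of the coin, so the refined measurement never attains a smaller entropy than the original one and drops out of the infimum.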

The order-$\alpha$ Rényi decomposition entropy is defined as:

$$\check{H}_\alpha(w) := \inf_{w = \sum_j q_j v_j \text{ convex decomp., } v_j \in \Omega_A \text{ pure}} H_\alpha(q) \tag{9.9}$$

This definition can be justified as follows: Assume we want to prepare a state $w$
by using states of maximal knowledge (i.e. pure states) $v_j$ and a random number
generator, which gives output $j$ with probability $p_j$. Thus for a device, which
outputs $v_j$ with probability $p_j$ and is described by $w$, we are interested in the lowest
information content/entropy of the random number generator necessary to build
such a device.

Like in [24], the measurement entropy is defined as

$$\widehat{H}(w) := \inf_{e \in \mathcal{E}^*} H(e(w)) = \inf_{e \in \mathcal{E}^*} \Big[ -\sum_j e_j(w) \log e_j(w) \Big] \tag{9.10}$$

i.e. the measurement entropy is the order-1 Rényi measurement entropy. Similarly,
the decomposition entropy is defined as

$$\check{H}(w) := \inf_{w = \sum_j q_j v_j \text{ convex decomp., } v_j \in \Omega_A \text{ pure}} H(q) \tag{9.11}$$

and coincides with the order-1 Rényi decomposition entropy.

The order-0 Renyi entropy and the order-2 Renyi entropy will have a special role in 
our discussion, so we will motivate why they are interesting from an information- 
theoretic point of view (see also Figure 9.3): 


At first we focus on the max-entropy, i.e. the order-0 Rényi entropy, i.e. $H_0(w) =
\log|\mathrm{supp}(p)|$ for $w = \sum_j p_j w_j$ a classical decomposition:

A preparation device randomly generates one of the states $w_j$ with probability $p_j$.
Wlog, we assume the number of such $w_j$ with $p_j \neq 0$ is a power of 2. We want to ask
yes-no-questions to determine which state is prepared. Then there exists a strategy
which needs exactly $\log|\mathrm{supp}(p)|$ questions, no matter what the state is and how
unlikely it is.⁸ No better strategy can be found which guarantees to need fewer steps,
no matter how unlikely the state is: We consider only the possible states, i.e. those
with $p_j \neq 0$. There are $|\mathrm{supp}(p)|$ of those. We split these states into two groups
of the same size, and ask whether the state is in the first group. By that we eliminate
one half of the states. Afterwards by the same procedure, we eliminate one half of
the remaining states, and so on until only one remains. Thus we use $\log_2|\mathrm{supp}(p)|$
steps. Assume we ask another question, which might split the states into other
fractions $q$, $1 - q$. Then we might be unlucky and the state is in the larger set, thus
we have eliminated fewer states than with the strategy described before. As we want
a guarantee for the maximal number of steps needed, no matter how unlucky we are
or unlikely the (possible) state is, $\log|\mathrm{supp}(p)|$ is the minimum number of steps we
can guarantee.

8 If the number of $w_j$ with $p_j > 0$ is not a power of 2, we can still apply the same strategy
by adding some probability-zero states until we reach a power of two. Then we find the true
state in $\lceil \log|\mathrm{supp}(p)| \rceil$ steps, and we will need more than $\lfloor \log|\mathrm{supp}(p)| \rfloor$ steps. In that sense,
$\log|\mathrm{supp}(p)|$ can be seen as a continuous interpolation if the number of relevant $w_j$ is not a
power of 2.
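
The halving strategy described above can be sketched in a few lines. This is an illustration only: the function name `questions_needed` and the state labels are hypothetical, and states are represented by plain strings.

```python
import math

def questions_needed(candidates, true_state):
    """Count the yes/no questions used by the halving strategy."""
    remaining = list(candidates)
    asked = 0
    while len(remaining) > 1:
        half = remaining[: len(remaining) // 2]
        asked += 1                      # "Is the state in the first group?"
        remaining = half if true_state in half else remaining[len(half):]
    return asked

states = [f"w{j}" for j in range(8)]    # 8 states with non-zero probability
counts = [questions_needed(states, s) for s in states]
```

For 8 possible states every entry of `counts` is 3, independent of which state was prepared. Running the same function on a list whose length is not a power of 2 gives a worst case equal to the rounded-up logarithm.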


Next, we explain the relevance of the order-2 Rényi entropy, i.e. $H_2(w) = -\log(\sum_j p_j^2)$
for $w = \sum_j p_j w_j$ any classical decomposition. Consider two independent sources
which are described by $w$ in the sense that they prepare $w_j$ with probability $p_j$.
Then $\sum_j p_j^2$ is the probability that both sources have prepared the same state. Thus
the order-2 Rényi entropy is also called collision entropy, “collision” meaning that
the output of both independent sources is the same.
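
As a small check of this interpretation, the collision probability can be obtained by enumerating all pairs of outputs of the two sources. The probabilities below are hypothetical example values.

```python
import math
from itertools import product

p = [0.5, 0.25, 0.25]       # hypothetical probabilities p_j of the states w_j

# Two independent sources: the pair (j, k) occurs with probability p_j * p_k.
# A collision is a pair with j == k.
collision_prob = sum(p[j] * p[k]
                     for j, k in product(range(len(p)), repeat=2) if j == k)

h2 = -math.log2(collision_prob)   # collision entropy H_2 in bits
```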




Figure 9.3: The left part of this figure visualizes the argument used for the max-entropy 
for 8 states. No matter how unlikely the true state is, we only need 3 questions 
to find it. The right part visualizes the “collision” of the collision entropy. 


9.2 Results 

Lemma 9.1. Consider a GPT which satisfies the postulates of classical decomposability
and strong symmetry. Let $(e_1, \ldots, e_n)$ be a fine-grained measurement.
Then $e_j = c_j \langle w_j, \cdot \rangle$ with $c_j \in [0,1]$ and $w_j$ normalized and pure.

Proof. Let $j \in \{1, \ldots, n\}$ be arbitrary. If $e_j = 0$, just choose $c_j = 0$ and any pure
state to be $w_j$.

Otherwise, because of the self-duality, there is a $w' \in A_+$ such that $\langle w', \cdot \rangle = e_j$. As
$w' \neq 0$, especially $u_A(w') \neq 0$. We can write $w' = c\,w$ with $w \in \Omega_A$, $c \in \mathbb{R}_{>0}$.
Now assume that $w$ is not pure. Then it has a non-trivial classical decomposition
$w = \sum_{k=0}^{N} p_k v_k$, wlog $p_k \in\, ]0,1]$, $v_k$ pure, perfectly distinguishable and none of them
equal to $w$. Especially $N \geq 1$, otherwise we had $w = v_0$ pure in contradiction to our
assumption. Then $e_j = \sum_{k=0}^{N} c\, p_k \langle v_k, \cdot \rangle$. Wlog, we only consider the case $j = n$
(the other cases are included by relabelling such that the effect in consideration is
called $e_n$), i.e. we have $e_n = \sum_{k=0}^{N} c\, p_k \langle v_k, \cdot \rangle$.

We define the measurement $e'_1 := e_1, \ldots, e'_{n-1} := e_{n-1}$ and $e'_{n+i} := c\, p_i \langle v_i, \cdot \rangle$ for all
$i \in \{0, \ldots, N\}$. $\langle v_i, \cdot \rangle \geq 0$ on all states and thus $0 \leq c\, p_i \langle v_i, \cdot \rangle \leq e_n \leq u_A$ on all
states, which means the $e'_{n+i}$ are indeed effects. Especially $\sum_{i=0}^{N} e'_{n+i} = e_n$. We find
$\sum_{k=1}^{n+N} e'_k = \sum_{k=1}^{n-1} e_k + \sum_{i=0}^{N} c\, p_i \langle v_i, \cdot \rangle = \sum_{k=1}^{n} e_k = u_A$, i.e. the $e'_1, \ldots, e'_{n+N}$ form a
measurement.
We define $M : \{1, \ldots, n+N\} \to \{1, \ldots, n\}$ by $M(i) := i$ for $1 \leq i \leq n-1$, $M(i) := n$
for $i \geq n$. Then

$$\sum_{\{j \mid M(j) = i\}} e'_j = e_i \quad \text{for } i < n \tag{9.12}$$

$$\sum_{\{j \mid M(j) = i\}} e'_j = \sum_{j=n}^{n+N} e'_j = e_n \quad \text{for } i = n \tag{9.13}$$

Thus $(e'_1, \ldots, e'_{n+N})$ is a fine-graining of $(e_1, \ldots, e_n)$. It is non-trivial, because $e'_n =
c\, p_0 \langle v_0, \cdot \rangle$ is not proportional to $e_n = \sum_{k=0}^{N} c\, p_k \langle v_k, \cdot \rangle$ (just check with $v_0$ and $v_1$ to
see that this is true). This is in contradiction to the requirement that $(e_1, \ldots, e_n)$ is a
fine-grained measurement. Thus $w$ is pure. As $c = c \langle w, w \rangle = \langle w', w \rangle = e_j(w) \leq 1$,
we have $c \in\, ]0,1]$. □


Lemma 9.2. Consider a GPT with classical decomposability and strong symmetry,
$w \in \Omega_A$ arbitrary, $w = \sum_{j=1}^{d} p_j w_j$ a classical decomposition into a maximal frame.⁹
Then the measurement which perfectly distinguishes the $w_j$ (i.e. $e_k(w_j) = \delta_{jk}$) can
be chosen to be fine-grained.


Proof. We consider $e_j := \langle w_j, \cdot \rangle$. As maximal frames add up to the order unit,
they form a measurement. Now assume there is a fine-graining $e'_j$, $\sum_{\{j \mid M(j) = k\}} e'_j =
e_k$. Using self-duality: $\sum_{\{j \mid M(j) = k\}} c'_j \langle w'_j, \cdot \rangle = \langle w_k, \cdot \rangle$ where $c'_j \langle w'_j, \cdot \rangle = e'_j$ with $w'_j$
normalized and $c'_j \geq 0$. So $\sum_{\{j \mid M(j) = k\}} c'_j w'_j = w_k$. Especially, $\sum_{\{j \mid M(j) = k\}} c'_j =
\sum_{\{j \mid M(j) = k\}} c'_j u_A(w'_j) = u_A(w_k) = 1$. Thus the $c'_j$ with $M(j) = k$ form a probability
distribution and $\sum_{\{j \mid M(j) = k\}} c'_j w'_j = w_k$ is a convex decomposition of a pure state.
This requires either $c'_j = 0$ or $w'_j = w_k$. In the first case $e'_j = 0 \propto e_k$, in the second
case $e'_j = c'_j e_k \propto e_k$. Thus the fine-graining is trivial. □


Lemma 9.3. Let $e = (e_1, \ldots, e_N) \in \mathcal{E}^*$ be a fine-grained measurement in a theory
fulfilling Postulates 1 and 2. Let $w \in \Omega_A$ be a state with classical decomposition
$w = \sum_j p_j w_j$. Let $q := (e_j(w))_j$ be the vector of outcome probabilities. Furthermore,
$p := (p_j)_j$.

Then $d \leq N$ where $d$ is the maximal frame size, i.e. the dimension (sometimes
denoted $N_A$).

Furthermore, if $p' := (p, 0, \ldots, 0)$ is an extension of $p$ to an $N$-dimensional vector
by adding zeroes (which is always possible because of $d \leq N$), then $q \prec p'$, i.e. there
is a bistochastic $N \times N$-matrix $M$ such that $q = M p'$.


9 This can always be obtained by extending the frame of a classical decomposition to a maximal
frame and adding the new elements with coefficients 0 to the classical decomposition. If one
uses a smaller frame $w_1, \ldots, w_n$ one can also use a (fine-grained) measurement $e_1, \ldots, e_d$, which
distinguishes a maximal frame-extension $w_1, \ldots, w_d$, to distinguish the $w_1, \ldots, w_n$. The effects
$e_{n+1}, \ldots, e_d$ will always give 0. Especially, the Shannon entropy of the measurement probabilities
is still $-\sum_{j=1}^{d} e_j(w) \log e_j(w) = -\sum_{j=1}^{n} p_j \log p_j$, which will be important for Theorem 9.4.
Proof. Let $(e_1, \ldots, e_N)$ be a fine-grained measurement. By our Lemma 9.1 all effects
are of the form $e_j = c_j \langle w'_j, \cdot \rangle$ with $c_j \in [0,1]$ and $w'_j$ normalized and pure. Furthermore
$\sum_{j=1}^{N} e_j = u_A$ as this is a measurement. Let $d$ denote the maximal frame size.
We define $q_i := e_i(w) = c_i \langle w'_i, w \rangle$. Now let us prove $\sum_i c_i = d$:

As maximal frames add up to the order unit: $\sum_{j=1}^{d} v_j = u_A$ for all maximal frames
$v_1, \ldots, v_d$. So

$$\sum_{j=1}^{N} c_j = \sum_{j=1}^{N} c_j u_A(w'_j) = \sum_{j=1}^{N} c_j \langle w'_j, u_A \rangle = \sum_{j=1}^{N} e_j(u_A) = u_A(u_A) = \langle u_A, u_A \rangle \tag{9.14}$$

$$= \sum_{j,k=1}^{d} \langle v_j, v_k \rangle = \sum_{j,k=1}^{d} \delta_{jk} = d \tag{9.15}$$

Especially, as $c_j \leq 1$, this implies $d \leq N$.

Now consider a classical decomposition of $w$ and extend it to a maximal frame
decomposition $w = \sum_{j=1}^{d} p_j w_j$ by adding zeroes. Note that adding zeroes will not
change $p'$. Define $q_{i|j} := e_i(w_j)$.

$$\sum_{j=1}^{d} q_{i|j}\, p_j = \sum_{j=1}^{d} e_i(p_j w_j) = e_i(w) = q_i \tag{9.16}$$

$$\sum_{j=1}^{d} q_{i|j} = \sum_{j=1}^{d} e_i(w_j) = c_i \sum_{j=1}^{d} \langle w'_i, w_j \rangle = c_i u_A(w'_i) = c_i \tag{9.17}$$

$$\sum_{i=1}^{N} q_{i|j} = \sum_{i=1}^{N} e_i(w_j) = u_A(w_j) = 1 \tag{9.18}$$

Once more, we used that maximal frames add up to the order unit and that measurement
effects sum up to the order unit.

We extend $p = (p_1, \ldots, p_d)$ to an $N$-component vector $p' := (p_1, \ldots, p_d, 0, \ldots, 0)$.
Furthermore we define $M_{ij} := q_{i|j}$ for $j \leq d$ and $M_{ij} := \frac{1 - c_i}{N - d}$ for $j > d$ (the latter case
can only happen if $d < N$). $M$ is a bistochastic $N \times N$-matrix:

$$\sum_{i=1}^{N} M_{ij} = \sum_{i=1}^{N} q_{i|j} = 1 \quad \text{for } j \leq d \tag{9.19}$$

$$\sum_{i=1}^{N} M_{ij} = \sum_{i=1}^{N} \frac{1 - c_i}{N - d} = \frac{N - d}{N - d} = 1 \quad \text{for } j > d \tag{9.20}$$

$$\sum_{j=1}^{N} M_{ij} = \sum_{j=1}^{d} q_{i|j} + (N - d)\, \frac{1 - c_i}{N - d} = c_i + 1 - c_i = 1 \tag{9.21}$$

Furthermore $M_{ij} \geq 0$ as $q_{i|j} = e_i(w_j) \geq 0$ and $\frac{1 - c_i}{N - d} \geq 0$ by $c_i \leq 1$, $d < N$ (for $N = d$,
there is no case $j > d$). $M$ has the important property that $q = M \cdot p'$:

$$\sum_{j=1}^{N} M_{ij}\, p'_j = \sum_{j=1}^{d} M_{ij}\, p_j = \sum_{j=1}^{d} q_{i|j}\, p_j = q_i \tag{9.22}$$

Thus $M$ is bistochastic and by Birkhoff’s theorem, $M$ is a statistical mixture of
permutations: $M = \sum_{\sigma \in S_N} P_\sigma\, \sigma$. □
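
The construction of $M$ can be tried out on a concrete case. The sketch below is an illustration under stated assumptions: it takes a qubit ($d = 2$) as the GPT, uses the three-outcome "trine" measurement with weights $c_i = 2/3$ as the fine-grained measurement ($N = 3$), and a hypothetical spectrum $p = (0.8, 0.2)$. Pure states are represented as real unit vectors and $\langle \cdot, \cdot \rangle$ as the squared overlap.

```python
import math

d, N = 2, 3                                   # qubit, three-outcome measurement
c = [2.0 / 3.0] * N                           # weights c_i of the trine effects
angles = [2.0 * math.pi * k / 3.0 for k in range(N)]
# Pure states w'_i as real unit vectors (cos(theta/2), sin(theta/2)).
phi = [(math.cos(t / 2.0), math.sin(t / 2.0)) for t in angles]

p = [0.8, 0.2]                                # hypothetical classical decomposition of w
p_ext = p + [0.0] * (N - d)                   # p' = (p, 0, ..., 0)

# M_ij = q_{i|j} = c_i <w'_i, w_j> for j <= d, and (1 - c_i)/(N - d) for j > d.
# The frame states w_1, w_2 are the two coordinate basis vectors here.
M = [[c[i] * phi[i][j] ** 2 for j in range(d)] +
     [(1.0 - c[i]) / (N - d)] * (N - d)
     for i in range(N)]

q = [sum(M[i][j] * p_ext[j] for j in range(N)) for i in range(N)]
row_sums = [sum(row) for row in M]
col_sums = [sum(M[i][j] for i in range(N)) for j in range(N)]
```

In this example all row and column sums equal 1 and `q` reproduces the outcome probabilities of the trine measurement on $w$, in line with (9.19) to (9.22). Note that $\sum_i c_i = 2 = d$, as in (9.15).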


Theorem 9.4. Consider a GPT which satisfies classical decomposability and strong
symmetry. Then the Rényi entropies $H_\alpha$ and the Rényi measurement entropies $\widehat{H}_\alpha$ coincide,
i.e.

$$\widehat{H}_\alpha(w) = H_\alpha(w) \quad \forall w \in \Omega_A,\ \alpha \in [0, \infty] \tag{9.23}$$

In particular the measurement entropy $\widehat{H}$ is the same as the thermodynamic/spectral
entropy¹⁰ which for a state $w$ with classical decomposition $w = \sum_j p_j w_j$ is defined
as $S(w) = -\sum_j p_j \log p_j$, i.e. we have $\widehat{H}(w) = S(w)\ \forall w \in \Omega_A$.


Proof. Once more for any fine-grained measurement $e_1, \ldots, e_N$, we set $q_i := e_i(w)$,
$p' = (p, 0, \ldots, 0)$ and let $M = \sum_{\sigma \in S_N} P_\sigma\, \sigma$ be the bistochastic $N \times N$ matrix with
$q = M \cdot p'$, compare Lemma 9.3. As the Shannon entropy is Schur-concave, we find

$$H(q) \geq \sum_{\sigma \in S_N} P_\sigma H(\sigma(p')) = \sum_{\sigma \in S_N} P_\sigma H(p') = H(p') = H(p) = -\sum_{j=1}^{d} p_j \log p_j = S(w) \tag{9.24}$$

Note that $H(p) = -\sum_{j=1}^{d} p_j \log p_j = S(w)$ is the result of a measurement that
distinguishes the states $w_j$: $e_k(w_j) = \delta_{jk}$. By Lemma 9.2 such a measurement can
indeed be chosen to be fine-grained.

For the other Rényi entropies $H_\alpha$, the same argument can be used because their
classical counterpart is also Schur-concave. □
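
The key step of the proof, that applying a mixture of permutations to $p'$ cannot decrease any Rényi entropy, can be illustrated numerically. The spectrum and the mixture weights below are hypothetical example values, and the helper `renyi` is a name introduced only for this sketch.

```python
import math
from itertools import permutations

def renyi(p, alpha):
    """Classical Renyi entropy in bits, including the orders 0, 1 and infinity."""
    s = [x for x in p if x > 1e-15]
    if alpha == 0:
        return math.log2(len(s))
    if alpha == 1:
        return -sum(x * math.log2(x) for x in s)
    if alpha == math.inf:
        return -math.log2(max(s))
    return math.log2(sum(x ** alpha for x in s)) / (1.0 - alpha)

p_ext = [0.8, 0.2, 0.0]                      # p' = (p, 0): hypothetical spectrum
perms = list(permutations(range(3)))
weights = [0.4, 0.1, 0.1, 0.2, 0.1, 0.1]     # hypothetical mixture P_sigma, sums to 1

# q = sum_sigma P_sigma * sigma(p'), i.e. q = M p' with M bistochastic.
q = [sum(wt * p_ext[s[i]] for wt, s in zip(weights, perms)) for i in range(3)]

# Entropy differences H_alpha(q) - H_alpha(p') for several orders alpha.
gaps = {a: renyi(q, a) - renyi(p_ext, a) for a in (0, 0.5, 1, 2, math.inf)}
```

All entries of `gaps` are non-negative in this example, which is the content of the inequality (9.24) and of its analogue for the other orders.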


In the context of Postulates 1 and 2, Postulate 3 is equivalent to the covering
property.

Definition 9.5. A GPT satisfies the covering property iff: 

For every face F and an atom a (i.e. the face given by a single pure state), the 
smallest face F V a containing F and a is either F itself or it covers F, i.e. F V a 
is a larger face than F and there is no other face in between them. 

In our context, covering property thus means: $F$ a face of $A_+$, $w$ pure, then the face
$G$ generated by $F$ and $w$ has rank $|G| \leq |F| + 1$.


10 The proofs of Theorem 9.4 for the measurement entropy together with Lemma 9.3 were found
in a collaboration with Howard Barnum and are an adaptation of the quantum proof found in
m , Lemma B.l 
Lemma 9.6. Consider a GPT satisfying Postulates 1 and 2, and the covering property.

Let $w_1, \ldots, w_n$ be pure states. Then the face $F$ generated by them satisfies $|F| \leq n$.
In particular, for every state in $F$, there is a classical decomposition using at most
$n$ pure states.

Proof. We prove this by induction:

In the case of just a single pure state, the decomposition is unique.

Now assume the statement is true for all sets of $n$ or fewer pure states. We also want
to show that the statement is true for all sets of $n + 1$ pure states:

Let $w_1, \ldots, w_{n+1}$ be pure states. By the induction hypothesis, the face $F$ generated
by $w_1, \ldots, w_n$ has a rank of at most $n$.

By the covering property, the face $F \vee (\mathbb{R}_{\geq 0} \cdot \{w_{n+1}\})$ contains $w_1, w_2, \ldots, w_{n+1}$ and
has a rank of at most $n + 1$, i.e. it is generated by a frame of at most $n + 1$ states.
As generated faces are minimal, also the face generated by $w_1, \ldots, w_{n+1}$ may have
no higher rank. □


Now we ask whether there is a convex decomposition into pure states which needs 
less trials than a frame decomposition. 

Theorem 9.7. Consider a GPT satisfying Postulates 1 and 2.

Then Postulate 3 is equivalent to $H_0 = \widehat{H}_0 = \check{H}_0$ (spectral, measurement and
decomposition entropy), i.e. in the context of Postulates 1 and 2, Postulate 3 is true
exactly if the order-0 Rényi entropies coincide.

Proof. At first, we consider the “$\Leftarrow$”-direction:

Assume $\check{H}_0 = H_0$, but that the covering property is not fulfilled. Then there exist a
face $F$ and a pure state $w$, such that the face $G$ generated by both has rank $|F| + 2$ or
higher. Let $w_1, \ldots, w_{|F|}$ be a frame generating the face $F$. Then $F$ is also generated
by $v := \frac{1}{|F|} \sum_{j=1}^{|F|} w_j$, i.e. the normalized projective unit. This statement is clear,
because every face containing $v$ also contains all the $w_j$ (and vice versa), and $F$ is
the smallest face for that.

So consider now the state $\frac{1}{2} w + \frac{1}{2} v$. If this state had a classical decomposition using
only $|F| + 1$ (or fewer) perfectly distinguishable pure states $u_1, \ldots, u_{|F|+1}$, then the face
generated by this frame already contains $v$ (and thus $w_1, \ldots, w_{|F|}$) and $w$, but it only
has rank $|F| + 1$ in contradiction to our assumption. Thus the classical decomposition
of $\frac{1}{2} w + \frac{1}{2} v$ uses at least $|F| + 2$ perfectly distinguishable pure states with non-zero
coefficients, i.e. $H_0(\frac{1}{2} w + \frac{1}{2} v) \geq \log(|F| + 2)$. But $\frac{1}{2} w + \frac{1}{2} v = \sum_{j=1}^{|F|} \frac{1}{2|F|} w_j + \frac{1}{2} w$ is a
convex decomposition into $|F| + 1$ pure states, i.e.

$$\check{H}_0\big(\tfrac{1}{2} w + \tfrac{1}{2} v\big) \leq \log(|F| + 1) < \log(|F| + 2) \leq H_0\big(\tfrac{1}{2} w + \tfrac{1}{2} v\big) \tag{9.25}$$

Once more, this is a contradiction to our assumption.

Thus if $\check{H}_0 = H_0$, then also the covering property must hold.
Now we prove the “$\Rightarrow$”-direction:

Now assume that the covering property holds.

Assume there is a state $w \in \Omega_A$ with $\check{H}_0(w) \neq H_0(w)$. This requires $H_0(w) > \check{H}_0(w)$,
because the classical decompositions are also included in the infimum-definition of
$\check{H}_0$. As the minimization procedure only runs over a discrete set of values (determined
by the number 1, 2, ... of pure states used), the infimum actually is a minimum. Thus there
exist pure states $w_1, \ldots, w_{2^{\check{H}_0(w)}}$ such that there is a convex decomposition:

$$w = \sum_{j=1}^{2^{\check{H}_0(w)}} p_j w_j \tag{9.26}$$

By Lemma 9.6 the face generated by these pure states has at most rank $2^{\check{H}_0(w)}$. By
convexity, it includes $w$. Thus there is a classical decomposition of $w$ which uses a
frame of size no larger than $2^{\check{H}_0(w)}$. Thus $H_0(w) > \check{H}_0(w)$ is impossible. □


However, the order-2 Rényi decomposition entropy reduces to the order-2 Rényi entropy
$H_2(w) = -\log(\sum_j p_j^2)$ without use of the third postulate, i.e. we do not need the
third postulate for all Rényi entropies:

Theorem 9.8. Consider a GPT which satisfies Postulates 1 and 2. Then for all
states:

$$\check{H}_2(w) = H_2(w) \tag{9.27}$$

Proof. Let $w = \sum_j p_j w_j$ be a classical decomposition, and $w = \sum_j q_j v_j$ any convex
decomposition into pure states. Then $\sum_j p_j^2 = \langle w, w \rangle = \sum_j q_j^2 + \sum_{j \neq k} q_j q_k \langle v_j, v_k \rangle$.
As $\langle v_j, v_k \rangle \geq 0$, we find $\sum_j q_j^2 \leq \sum_j p_j^2$, thus $H_2(p) = -\log(\sum_j p_j^2) \leq -\log(\sum_j q_j^2)$.
Thus classical decompositions indeed minimize the collision entropy. □
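
The identity used in this proof can be checked on a qubit. The following sketch is an illustration under assumptions: it represents qubit states by Bloch vectors and uses $\langle v, v' \rangle = \frac{1}{2}(1 + \vec{a} \cdot \vec{b})$ as the inner product of two pure states; the value $s = 0.6$ and the second decomposition are hypothetical example choices.

```python
import math

def overlap(a, b):
    """<v, v'> = (1 + a.b)/2 for qubit states with Bloch vectors a, b."""
    return 0.5 * (1.0 + sum(x * y for x, y in zip(a, b)))

s = 0.6                                      # Bloch vector of w is (0, 0, s)
p = [(1 + s) / 2, (1 - s) / 2]               # classical decomposition of w

# Another convex decomposition of the same w into two non-orthogonal pure states:
# their Bloch vectors average to (0, 0, s).
t = math.sqrt(1 - s * s)
v = [(t, 0.0, s), (-t, 0.0, s)]
q = [0.5, 0.5]

sum_p2 = sum(x * x for x in p)
sum_q2 = sum(x * x for x in q)
cross = sum(q[j] * q[k] * overlap(v[j], v[k])
            for j in range(2) for k in range(2) if j != k)
```

Here `sum_q2 + cross` reproduces `sum_p2`, and since the cross term is non-negative the collision entropy of the non-classical decomposition is at least that of the classical one.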
10 Weak spectrality does not imply spectrality

10.1 The idea 

Our notion of classical decomposability is inspired by the diagonalization of density 
operators. If a state space satisfies the postulate of classical decomposability, we 
also say that it satisfies weak spectrality. However, in contrast to the eigenvalues 
of a density operator, it is not clear that the coefficients of a classical decomposition 
are unique except for permutation and zeroes. To show this result, we had to use the 
second postulate, strong symmetry. Thus we say a state space with weak spectrality
satisfies (unique/strong) spectrality¹¹ if the coefficients are unique except for
permutation and zeroes.

We want to construct an example which shows that weak spectrality does not imply 
unique spectrality. This means that we want to find an example for a state space, 
in which: 


1 . every state has at least one classical decomposition 

2 . there is at least one state which has at least two different classical decompo¬ 
sitions whose coefficients are not just a permutation of each other 


This result shows that classical decomposability alone is not enough in order for our 
entropy to be well-defined. 

To prove this, we consider an egg-like state space 



Figure 10.1: The state space is chosen to look like a 2D egg. 


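It is quite obvious that the state shown in the figure above has two different classical
decompositions whose coefficients are not just a permutation of each other:

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 0 \\ r \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 0 \\ -r \end{pmatrix} = \frac{r}{r+R} \begin{pmatrix} -R \\ 0 \end{pmatrix} + \frac{R}{r+R} \begin{pmatrix} r \\ 0 \end{pmatrix} \tag{10.1}$$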
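
A quick numerical check of (10.1), using hypothetical radii $r = 1$ and $R = 2$, confirms that both decompositions give the origin while their coefficient vectors have different Shannon entropies. This is only a sketch; the variable names are introduced for illustration.

```python
import math

r, R = 1.0, 2.0                              # hypothetical radii with r != R

def shannon(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Decomposition along the vertical axis: pure states (0, r) and (0, -r).
p_vert = [0.5, 0.5]
# Decomposition along the symmetry axis: pure states (-R, 0) and (r, 0).
p_horiz = [r / (r + R), R / (r + R)]

origin_vert = (0.0, p_vert[0] * r + p_vert[1] * (-r))
origin_horiz = (p_horiz[0] * (-R) + p_horiz[1] * r, 0.0)

entropies = (shannon(p_vert), shannon(p_horiz))
```

For these radii the two entries of `entropies` differ, so an entropy computed from the coefficients of a classical decomposition would depend on the decomposition chosen.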


11 The notions “weak” and “unique spectrality” have been defined by Howard Barnum while
working with our group. This chapter is an answer to his question whether weak spectrality
implies unique spectrality.
So it remains to show that every state in this state space has a classical
decomposition, i.e. that weak spectrality is satisfied.

The idea is like this: We put a tangent hyperplane (i.e. a straight tangent line) at an 
arbitrary boundary point of the egg. Afterwards, we consider a parallel hyperplane 
on the other side of the egg and move it towards the egg until it hits the egg. 



Figure 10.2: The first tangent line is chosen freely. The tangent line on the other side is 
constructed by moving a distant parallel line towards the egg. 


We call the tangent lines $E_1$ and $E_2$, and the points where they and the egg intersect
are called $p_1$ and $p_2$. The line that connects these points is called $l$. Now we move
$p_1$ around the boundary of the egg, while keeping $E_1$ tangential. At the same time,
we also move $p_2$ around the boundary such that $E_2$ stays tangential and parallel to
$E_1$. In Appendix D.3, there is a visualisation with Mathematica.




Figure 10.3: This figure shows how the tangential lines are rotated along the egg. 


The conjecture is that while doing this for a full 360 degrees, every point of the egg
lies on the connection line $l$ at least once. If a point $p$ is found on $l$, it is a convex
combination of $p_1$ and $p_2$. $p_1$, $p_2$ are pure. We define effects $e_1$, $e_2$ by $e_j(E_k) = \{\delta_{jk}\}$
and extend affine-linearly, as explained in Chapter 3, Example 3.7. Thus $p_1$ and $p_2$
are perfectly distinguishable pure states and $p$ has a classical decomposition. If the
conjecture is true, then the egg satisfies weak spectrality.


10.2 Proof idea 


A “topological” proof idea is shown in Figure 10.4. Like before, we put two parallel
tangent hyperplanes (i.e. straight tangent lines) at the egg. Wlog, they are chosen
such that the straight line through the egg connecting them coincides with the
symmetry axis of the egg-shaped state space.




Figure 10.4: This figure visualizes an idea for a proof that the egg space satisfies weak
spectrality.


Again, we call the tangent lines $E_1$ and $E_2$, and the points where they and the egg
intersect are called $p_1$ and $p_2$. The line that connects these points is called $l$. The
line $l$ splits the egg into two parts. We mark one part with +, the other with −.
Now we move $p_1$ around the boundary of the egg, while keeping $E_1$ tangential. At
the same time, we also move $p_2$ around the boundary such that $E_2$ stays tangential
and parallel to $E_1$. We stop when $p_1$ coincides with the other end of the symmetry
axis. Then the situation looks almost like in the beginning, but the +- and −-parts
are exchanged. This means that every point of the egg has changed its sign either
from + to − or vice versa. As the line $l$ is moved continuously through the egg, this
requires that each point of the egg lay on $l$ at least once. This means that each point
of the egg has at least one convex decomposition into perfectly distinguishable pure
states ($p_1$ and $p_2$).


While one can expect that this proof idea will work for many strictly convex state 
spaces, we will now consider an exact proof for our egg shaped state space based on 
this idea. We will need a parametrization of our egg shaped state space. 
10.3 Parametrization



Figure 10.5: Detailed picture of how we parametrize the egg. 


We want to provide an analytical proof, but also check our results with Mathematica.
According to [45], Mathematica chooses to have arccot values between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$,
the same for arctan [45].

So now we give the exact parametrization of the 2D egg: The first part is a half-circle,
given by

$$(x, y) = (r \cos(\alpha), r \sin(\alpha)) \tag{10.2}$$

for $\alpha \in [-\frac{\pi}{2}, \frac{\pi}{2}]$. The second part is an ellipse given by

$$(x, y) = (-R \cos(\beta), r \sin(\beta)) \tag{10.3}$$

for $\beta \in [-\frac{\pi}{2}, \frac{\pi}{2}]$. Choose an arbitrary angle $\alpha$. Then

$$\frac{dy}{d\alpha} = r \cos(\alpha), \qquad \frac{dx}{d\alpha} = -r \sin(\alpha) \tag{10.4}$$

Thus the slope of the tangential line is given¹² by $\frac{dy}{dx} = -\cot(\alpha)$. Now we want to
find the parallel line on the other side. At first

$$\frac{dy}{d\beta} = r \cos(\beta), \qquad \frac{dx}{d\beta} = R \sin(\beta) \tag{10.5}$$

Thus the slope on the other side is given by $\frac{dy}{dx} = \frac{r}{R} \cot(\beta)$. We want the slopes to
agree, i.e. $\frac{r}{R} \cot(\beta) = -\cot(\alpha)$. Thus

$$\beta(\alpha) = \mathrm{arccot}\Big( -\frac{R}{r} \cot(\alpha) \Big) \tag{10.6}$$

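12 We apply the chain rule here: $\frac{y(\alpha) - y(\alpha_0)}{x(\alpha) - x(\alpha_0)} = \frac{y(\alpha) - y(\alpha_0)}{\alpha - \alpha_0} \Big/ \frac{x(\alpha) - x(\alpha_0)}{\alpha - \alpha_0} \to \frac{dy}{d\alpha} \Big/ \frac{dx}{d\alpha}$ for $\alpha \to \alpha_0$.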
The line connecting $p_1$ and $p_2$ is

$$l_\alpha(p) = p \begin{pmatrix} r \cos(\alpha) \\ r \sin(\alpha) \end{pmatrix} + (1 - p) \begin{pmatrix} -R \cos(\beta(\alpha)) \\ r \sin(\beta(\alpha)) \end{pmatrix} \tag{10.7}$$

where $p \in [0,1]$.


To prove weak spectrality, we have to show that every point of the egg lies on
one of the $l_\alpha$. Based on our parametrization, we can implement this with Mathematica,
see Appendix D.1. In Appendix D.2, the same is done for an alternative
parametrization. For both parametrizations, we first plot the boundary of the egg
space. Afterwards, we plot all $l_\alpha(p)$, i.e. all points that have at least one classical
decomposition. As one can see, Mathematica suggests that all points have a
classical decomposition. Thus weak spectrality holds, but not unique spectrality as
already argued in the beginning. Take a moment to appreciate that by choosing
$\alpha, \beta \in [-\pi/2, \pi/2]$, we have chosen the same conventions as Mathematica.
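
Independently of the Mathematica code in the appendices, the parametrization can be sketched in Python. This is a hedged illustration, not the implementation from Appendix D: the radii are hypothetical, all function names are introduced here, and $\beta(\alpha)$ is written as $-\mathrm{atan2}(r \sin\alpha, R \cos\alpha)$, which agrees with $\mathrm{arccot}(-\frac{R}{r}\cot\alpha)$ on $[-\frac{\pi}{2}, \frac{\pi}{2}]$ under the arccot convention stated above and avoids the divergence of cot at $\alpha = 0$.

```python
import math

r, R = 1.0, 2.0                              # hypothetical egg parameters

def beta(alpha):
    """beta(alpha) = arccot(-(R/r) cot(alpha)) with values in [-pi/2, pi/2].

    Written with atan2 so that alpha = 0 and alpha = +-pi/2 need no special case.
    """
    return -math.atan2(r * math.sin(alpha), R * math.cos(alpha))

def p1(alpha):
    """Point on the half-circle."""
    return (r * math.cos(alpha), r * math.sin(alpha))

def p2(alpha):
    """Opposite point on the half-ellipse."""
    b = beta(alpha)
    return (-R * math.cos(b), r * math.sin(b))

def tangents(alpha):
    """Tangent direction vectors at p1(alpha) and p2(alpha)."""
    b = beta(alpha)
    return ((-r * math.sin(alpha), r * math.cos(alpha)),
            (R * math.sin(b), r * math.cos(b)))

def line_point(alpha, p):
    """l_alpha(p) = p * p1 + (1 - p) * p2."""
    (x1, y1), (x2, y2) = p1(alpha), p2(alpha)
    return (p * x1 + (1 - p) * x2, p * y1 + (1 - p) * y2)
```

With these definitions the two tangent directions returned by `tangents` are parallel for every $\alpha$, and sampling `line_point` over a grid of $\alpha$ and $p$ values gives the set of points that have a classical decomposition.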


10.4 Exact proof 

Now, we show an analytical proof that does not need Mathematica: 

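So far, we have found

$$l_\alpha(p) = p \begin{pmatrix} r \cos(\alpha) \\ r \sin(\alpha) \end{pmatrix} + (1 - p) \begin{pmatrix} -R \cos(\beta(\alpha)) \\ r \sin(\beta(\alpha)) \end{pmatrix} \tag{10.8}$$

$$= \begin{pmatrix} -R \cos(\beta(\alpha)) \\ r \sin(\beta(\alpha)) \end{pmatrix} + p \begin{pmatrix} r \cos(\alpha) + R \cos(\beta(\alpha)) \\ r \sin(\alpha) - r \sin(\beta(\alpha)) \end{pmatrix} =: a_\alpha + p \cdot t_\alpha \tag{10.9}$$

where

$$\beta(\alpha) = \mathrm{arccot}\Big( -\frac{R}{r} \cot(\alpha) \Big) \tag{10.10}$$

$t_\alpha$ gives the direction of the straight line $l_\alpha$. A vector perpendicular to $l_\alpha$ is thus
given by:

$$n_\alpha := \begin{pmatrix} -r \sin(\alpha) + r \sin(\beta(\alpha)) \\ r \cos(\alpha) + R \cos(\beta(\alpha)) \end{pmatrix} \tag{10.11}$$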

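While intuitively clear, we want to prove that $\beta(\alpha)$ is well-defined and continuous.
Figures 10.6 and 10.7 show that the only critical arguments are $\alpha = 0$ (here cot
diverges) and $\alpha = \pm\frac{\pi}{2}$ (here arccot(0)).

$$\lim_{\alpha \to 0} \beta(\alpha) = \mathrm{arccot}(\mp\infty) = 0 \tag{10.12}$$

$$\lim_{\alpha \nearrow \frac{\pi}{2}} \beta(\alpha) = \lim_{x \searrow 0} \mathrm{arccot}(-x) = -\frac{\pi}{2} \tag{10.13}$$

$$\lim_{\alpha \searrow -\frac{\pi}{2}} \beta(\alpha) = \lim_{x \nearrow 0} \mathrm{arccot}(-x) = +\frac{\pi}{2} \tag{10.14}$$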
Figure 10.6: The cot-function has zeroes for $\pm\frac{\pi}{2}$ and diverges in 0.

In[1]:= Plot[ArcCot[x], {x, -5, 5}]

Figure 10.7: The arccot function as chosen by Mathematica [45] and us is discontinuous in
0. For infinite argument, arccot approaches 0.

Thus, as expected, for $\alpha = 0$ we find $\beta = 0$, while for $\alpha = \pm\frac{\pi}{2}$ we find $\beta = \mp\frac{\pi}{2}$. Note
that we used that $\alpha \in [-\frac{\pi}{2}, \frac{\pi}{2}]$. Thus $\beta(\alpha)$ is well-defined (by continuous extension)
and continuous.

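Now let $w \in \Omega_A$ be an arbitrary state. Then:

$$w \in l_\alpha \ \Leftrightarrow\ \exists p \in [0,1] : w = a_\alpha + p \cdot t_\alpha \ \Leftrightarrow\ \exists p \in [0,1] : w - a_\alpha = p \cdot t_\alpha \tag{10.15}$$

$$\Leftrightarrow\ (w - a_\alpha) \cdot n_\alpha = 0 \tag{10.16}$$

In the last equivalence, for $\Rightarrow$ we multiplied with $n_\alpha$ and used $n_\alpha \cdot t_\alpha = 0$. For $\Leftarrow$,
we used that $n_\alpha$, $t_\alpha$ are an orthogonal basis¹³, thus $(w - a_\alpha) \cdot n_\alpha = 0$ implies that
$w - a_\alpha \propto t_\alpha$. As $w$ is a state¹⁴, the proportionality constant has to be $\in [0,1]$.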
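Now consider an arbitrary state $w \in \Omega_A$. We define the function

$$g_w : \Big[ -\frac{\pi}{2}, \frac{\pi}{2} \Big] \to \mathbb{R}, \qquad g_w(\alpha) := (w - a_\alpha) \cdot n_\alpha = \left( w - \begin{pmatrix} -R \cos(\beta(\alpha)) \\ r \sin(\beta(\alpha)) \end{pmatrix} \right) \cdot \begin{pmatrix} -r \sin(\alpha) + r \sin(\beta(\alpha)) \\ r \cos(\alpha) + R \cos(\beta(\alpha)) \end{pmatrix} \tag{10.17}$$

$g_w$ is a continuous function. Now we compare the situations for $\alpha = \pm\frac{\pi}{2}$.
For $\alpha = \pm\frac{\pi}{2}$, we find $\beta = \mp\frac{\pi}{2}$ and thus $a_{\pm\pi/2} = (0, \mp r)^T$ and $n_{\pm\pi/2} = (\mp 2r, 0)^T$. Therefore:

$$g_w\Big( \pm\frac{\pi}{2} \Big) = \mp 2 r\, w_x$$

where $w_x$ denotes the first component of $w$, i.e. $g_w(+\frac{\pi}{2}) = -g_w(-\frac{\pi}{2})$.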


13 As $l_\alpha$ was constructed from two opposing points in the egg and $t_\alpha$ is the vector connecting these
points, $t_\alpha \neq 0\ \forall \alpha$. Except for sign and exchange, $n_\alpha$ has the same components and thus also
never vanishes.

14 $l_\alpha$ connects two opposing boundary points given for $p = 0$ and $p = 1$. Thus for $p \notin [0,1]$,
one leaves the egg. Also note that the condition $p \in [0,1]$ in these equivalences is not really
necessary. It is sufficient to check if $w$ is on the straight line given by $l_\alpha$; $p \in [0,1]$ follows
because $w$ is a state.
Figure 10.8: This figure visualises the connection between the exact analytical proof and
the proof idea presented before.


Thus, if not already $g_w(\pm\frac{\pi}{2}) = 0$, $g_w$ changes its sign. In that case, by the intermediate
value theorem, there is an $\alpha \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ with $g_w(\alpha) = 0$. Thus in all cases, there
is an $\alpha$ with $(w - a_\alpha) \cdot n_\alpha = 0$, i.e. $w \in l_\alpha$. This means that $w$ has a classical
decomposition. As $w \in \Omega_A$ was arbitrary, weak spectrality/classical decomposability
holds.
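
The intermediate value argument is constructive and can be turned into a small bisection routine. The sketch below is an illustration under assumptions: the radii and the test point are hypothetical, the function names are introduced here, and $\beta(\alpha)$ is again written with atan2, which agrees with $\mathrm{arccot}(-\frac{R}{r}\cot\alpha)$ on $[-\frac{\pi}{2}, \frac{\pi}{2}]$.

```python
import math

r, R = 1.0, 2.0                              # hypothetical egg parameters

def beta(alpha):
    # Same function as arccot(-(R/r) cot(alpha)) on [-pi/2, pi/2], continuous at 0.
    return -math.atan2(r * math.sin(alpha), R * math.cos(alpha))

def g(w, alpha):
    """g_w(alpha) = (w - a_alpha) . n_alpha."""
    b = beta(alpha)
    a = (-R * math.cos(b), r * math.sin(b))
    n = (-r * math.sin(alpha) + r * math.sin(b),
         r * math.cos(alpha) + R * math.cos(b))
    return (w[0] - a[0]) * n[0] + (w[1] - a[1]) * n[1]

def classical_decomposition(w, steps=200):
    """Bisection on g_w; returns (alpha, p) with w = p * p1 + (1 - p) * p2."""
    lo, hi = -math.pi / 2, math.pi / 2
    g_lo = g(w, lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        g_mid = g(w, mid)
        if (g_lo <= 0) == (g_mid <= 0):      # same sign class as the left end
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    b = beta(alpha)
    a = (-R * math.cos(b), r * math.sin(b))
    t = (r * math.cos(alpha) + R * math.cos(b),
         r * math.sin(alpha) - r * math.sin(b))
    # Coefficient p from w - a_alpha = p * t_alpha.
    p = ((w[0] - a[0]) * t[0] + (w[1] - a[1]) * t[1]) / (t[0] ** 2 + t[1] ** 2)
    return alpha, p

w = (-0.7, 0.3)                              # a point inside the egg
alpha, p = classical_decomposition(w)
```

The returned pair describes $w$ as the convex combination $p \cdot p_1(\alpha) + (1 - p) \cdot p_2(\alpha)$ of two perfectly distinguishable boundary points, with $p \in [0,1]$ for points inside the egg.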


10.5 Conclusion 

We have analysed a state space which has a 2D egg-shape. Each state in this state 
space has a classical decomposition (i.e. weak spectrality holds). We used Mathe- 
matica and an analytical proof (based on an idea that should also work for other 
strictly convex state spaces) to show that weak spectrality holds. But there is at 
least one state, which has classical decompositions that differ in their coefficients 
(i.e. unique spectrality does not hold). This has the important thermodynamic 
consequence, that an entropy similar to the von Neumann entropy would not be 
well-defined for a state, because its value depends on the choice of classical decom¬ 
position. Especially, the state would not fully characterize the ensemble, i.e. “the 
state does not describe the state of the system”. Furthermore, if the ensemble is not
realized by perfectly distinguishable pure states, in general we cannot define the
entropy at all. In contrast, the postulate of (unique) spectrality directly leads to
the generalization of the von Neumann entropy. In fact, a postulate or result of the 
form of unique spectrality is needed to know what the eigenvalues and orthonor¬ 
mal eigenvectors can be replaced with to formulate the generalized von Neumann 
entropy. Thus it is an interesting question to investigate what other principles or 
postulates lead to unique spectrality. 
11 Pfister’s state discrimination principle


This section has no direct relevance for entropy or thermodynamics. However, the 
strong structure provided by our postulates also fulfils other physically motivated 
principles/postulates. One example is Pfister’s state discrimination principle [13]:
To illustrate its significance, we start with a seemingly trivial example: We can
perfectly distinguish hats and T-shirts. Among the T-shirts, we can distinguish
blue T-shirts from black T-shirts. We can also distinguish hats, blue T-shirts and
black T-shirts from each other. What seems quite trivial is not necessarily true in
all GPTs. We assume we wish to analyze an object which is either a GPT-hat or
a blue or black GPT-T-shirt. If we ask “Is it a GPT-hat?” it may happen that the
object is destroyed or collapses into another object. Then in case of a GPT-T-shirt,
asking “Is it blue or black?” might not be possible or might give the wrong answer.
Thus Pfister postulated that this will not happen. The basic idea is in the spirit
of Specker’s principle: “Do you know what, according to me, is the fundamental
theorem of quantum mechanics? (...) That is, if you have several questions and
you can answer any two of them, then you can also answer all three of them” (see
e.g. [10] for more information and references).

Theorem 11.1. Consider a GPT which satisfies Postulate 1 (Classical Decomposability)
and Postulate 2 (Strong Symmetry).

Then A fulfils Pfister’s [13] Postulate 3, the state discrimination principle:

Let $B_1, B_2 \subset \Omega_A$ be perfectly distinguishable sets of states. Assume that in addition,
there are subsets $B_3, B_4 \subset B_2$ such that $B_3$ is perfectly distinguishable from $B_4$.
Then $B_1, B_3, B_4$ are perfectly distinguishable.


Proof. By definition of perfect distinguishability, there are effects $e_1, e_2$ such that
$e_k(B_j) = \{\delta_{jk}\}$ for $j, k = 1, 2$ and effects $e_3, e_4$ such that $e_k(B_j) = \{\delta_{jk}\}$ for $j, k =
3, 4$. Especially, $B_j$ lies in the faces $e_j^{-1}(1)$ and $e_k^{-1}(0)$ for $\{k, j\} = \{1, 2\}$ or $\{k, j\} =
\{3, 4\}$ (interpreted as subsets of $\Omega_A$), compare Lemma 3.18. Let $E_j$ be the minimal
face which contains $B_j$. Then especially $E_j \subset e_j^{-1}(1), e_k^{-1}(0)$ for $\{k, j\} = \{1, 2\}$ or
$\{k, j\} = \{3, 4\}$.

There are frames $w_1^{(j)}, \ldots, w_{|E_j|}^{(j)}$ that generate $E_j$.

There is a 1-1 correspondence between the faces of $\Omega_A$ and $A_+$, especially $\mathbb{R}_{\geq 0} \cdot e_j^{-1}(1)$,
$\mathbb{R}_{\geq 0} \cdot e_k^{-1}(0)$, $\mathbb{R}_{\geq 0} \cdot E_j$ are the corresponding faces of $A_+$.

By Proposition 3.17, $\mathbb{R}_{\geq 0} \cdot E_j$ is also generated by the frame $w_1^{(j)}, \ldots, w_{|E_j|}^{(j)}$.
The projective unit

$$\sum_{a=1}^{|E_j|} \langle w_a^{(j)}, \cdot \rangle = u_{[\mathbb{R}_{\geq 0} \cdot E_j]} =: u_j$$

is an effect with $u_j(w) = 1$ for all $w \in (\mathbb{R}_{\geq 0} \cdot E_j) \cap \Omega_A = E_j$ and $0 \leq u_j \leq u_A$.
By $e_1$ and $e_2$, $B_1$ and $B_3 \subset B_2$ are perfectly distinguishable. As $E_1$ and $E_3 \subset E_2$ are the minimal faces which contain $B_1$ or $B_3 \subset B_2$, we find $E_1 \subset e_1^{-1}(1), e_2^{-1}(0)$ and $E_3 \subset e_2^{-1}(1), e_1^{-1}(0)$. By perfect distinguishability thus $\langle w_a^{(1)}, w_b^{(3)} \rangle = 0$. By the same reasoning $\langle w_a^{(1)}, w_b^{(4)} \rangle = 0$. Furthermore, as the $w_a^{(j)}$ form frames for fixed $j$, $\langle w_a^{(j)}, w_b^{(j)} \rangle = \delta_{ab}$.

Furthermore, $B_3$ and $B_4$ are perfectly distinguishable by $e_3$ and $e_4$. As $E_3$ and $E_4$ are the minimal faces which contain $B_3, B_4$, we find $E_3 \subset e_3^{-1}(1), e_4^{-1}(0)$ and $E_4 \subset e_4^{-1}(1), e_3^{-1}(0)$. Thus $E_3$ and $E_4$ are perfectly distinguishable by $e_3$ and $e_4$ and we find $\langle w_a^{(3)}, w_b^{(4)} \rangle = 0$.

So in total, we found: $\langle w_a^{(j)}, w_b^{(k)} \rangle = \delta_{jk}\delta_{ab}$ for $j,k \in \{1,3,4\}$. Therefore $\{w_a^{(j)} \mid j \in \{1,3,4\}\}$ is a frame and $\sum_{j=1,3,4} u_j = \sum_{j=1,3,4} \sum_a \langle w_a^{(j)}, \cdot \rangle \leq u_A$.

The faces $E_1, E_3, E_4$ are orthogonal to each other because they are pairwise perfectly distinguishable. Thus $u_j(E_k) = 0$ for $j \neq k$. (Alternative reason: $\sum_{j=1,3,4} u_j \leq u_A$ and thus $u_j(E_j) = 1$ implies $u_k(E_j) = \delta_{jk}$.)

In total we found $u_j(B_k) = \{\delta_{jk}\}$ for $j,k \in \{1,3,4\}$. The $u_j$ are effects whose sum is no larger than $u_A$. Thus the $u_j$ perfectly distinguish $B_1, B_3, B_4$. □
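For orientation, the statement can be read off directly in ordinary quantum theory, which satisfies both postulates. The following is only an illustrative sketch: the subspaces $\mathcal{H}_j$ and projectors $P_j$ are notation introduced here for the example and are not part of the proof above. It assumes that each $B_j$ consists of density operators supported on a subspace $\mathcal{H}_j$.

```latex
% Quantum illustration (assumed notation: \mathcal{H}_j, P_j)
% Perfect distinguishability of B_1, B_2 and of B_3, B_4 \subset B_2 means
\mathcal{H}_1 \perp \mathcal{H}_2, \qquad
\mathcal{H}_3, \mathcal{H}_4 \subseteq \mathcal{H}_2, \qquad
\mathcal{H}_3 \perp \mathcal{H}_4 .
% Hence P_1, P_3, P_4 are mutually orthogonal projectors, and the projective units
u_j(\rho) = \operatorname{Tr}(P_j \rho), \qquad j \in \{1,3,4\},
% satisfy
P_1 + P_3 + P_4 \leq \mathbb{1}, \qquad
u_j(\rho) = \delta_{jk} \quad \text{for all } \rho \in B_k .
```

In this special case the minimal face $E_j$ is the set of all states supported on $\mathcal{H}_j$, and the three effects $u_1, u_3, u_4$ form part of a single projective measurement.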



12 Conclusion and outlook 


In this thesis, we have explored the thermodynamic consequences of two important
postulates. These two postulates are the condensed form of important structural
properties of quantum theory and classical theory. The first postulate is that every
state belongs to a classical subspace; non-classical behaviour is only possible because
there might be different classical subspaces. The second postulate is the computational
equivalence of all n-level systems.

Following a suggestion by J. Barrett, we adapted a thought experiment by von
Neumann and an expansion thereof to derive a thermodynamic entropy for GPTs
fulfilling the two postulates. This entropy is a direct analogue of the von Neumann
entropy. We proved that this entropy is well-defined and we showed how it behaves
under decomposition into perfectly distinguishable states. Furthermore, we constructed
projective measurements and proved the second law for such measurements as well
as for mixing processes. In that context, we also generalized observables to our GPTs.
We showed that the information-theoretic measurement entropy coincides with our
entropy and generalized this result to other Rényi entropies. Furthermore, we
investigated the Rényi decomposition entropies in the context of third order interference.
Then we used an egg-shaped state space to show that the first postulate alone
does not imply the second and thus does not always lead to a well-defined entropy.
Furthermore, we showed that our GPTs satisfy the state discrimination principle
formulated by Pfister.

There are many ways to build upon the results of this thesis:

An interesting question is whether the third postulate (no 3rd order interference)
already is a consequence of the first two postulates or whether it is independent. If
it is a consequence, then the GPTs considered by us all have the important property
that they do not exhibit non-trivial 3rd order interference. If the third postulate
does not follow from the first two, then there might be many more GPTs for which
our results work. In particular, Postulates 1, 2, 4 provide entropy and energy, and
thus allow one to define the free energy and perform equilibrium thermodynamics. So if
Postulate 3 really is necessary to obtain quantum theory, then we could define
equilibrium thermodynamics for some non-classical and non-quantum systems. Also, we
have seen that the third postulate is equivalent to the result that the decomposition
max-entropy is identical with the spectral definition of the max-entropy. Investigating
this equivalence might be a way to find out whether the third postulate is
independent of the first two postulates. In the same way, it would be interesting to
analyze whether the decomposition entropy agrees with the spectral entropy, and
especially whether we need the third postulate for this identity. Furthermore, it is an
interesting question whether the first two postulates can be replaced by weaker
postulates. For example, using unique spectrality leads to a well-defined generalization
of the von Neumann entropy. As projective measurements play an important role,
it would be interesting to check what consequences projective state spaces would
have. Also, one should try to consider infinite-dimensional GPTs.



References 



[1] N. D. Mermin, Could Feynman have said this?, Physics Today 57(5), 10 (2004) 

[2] M. Jammer, The Philosophy of Quantum Mechanics: The Interpretations of 
QM in historical perspective , John Wiley and Sons (1974) 

[3] G. Jaeger, Entanglement, Information, and the Interpretation of Quantum 
Mechanics , Springer-Verlag Berlin Heidelberg (2009) 

[4] J. S. Bell, On the Einstein-Podolsky-Rosen paradox, Physics 1, 195-200 (1964), 
reprinted in J. S. Bell, Speakable and Unspeakable in Quantum Mechanics, 
Cambridge University Press, Cambridge (1987) 

[5] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer, 
Berlin, 1932 

English translation: J. von Neumann, Mathematical Foundations of Quantum
Mechanics, Princeton University Press (1955), translation by R. T. Beyer

[6] J. A. Wheeler, Recent thinking about the nature of the physical world: It from
bit, Annals of the New York Academy of Sciences, 655(1):349-364, 1992

[7] Jonathan Barrett, talk at “Fundamentals of Physics and Information” workshop
at ETH Zurich (2010)

[8] G. Birkhoff, J. von Neumann, The logic of quantum mechanics, The Annals 
of Mathematics 37(4), 823-843 (1936) 

[9] G. W. Mackey, The mathematical foundations of quantum mechanics, W.A. 
Benjamin Inc, New York, 1963 

[10] H. Barnum, M. P. Müller, and C. Ududec, Higher-order interference and
single-system postulates characterizing quantum theory, New J. Phys. 16,
123029 (2014), arXiv:1403.4147v4

[11] M. P. Müller and C. Ududec, The Structure of reversible computation determines
the self-duality of quantum theory, Phys. Rev. Lett. 108, 130401 (2012),
arXiv:1110.3516v2

[12] Cozmin Ududec, Perspectives on the Formalism of Quantum Theory, PhD 
Thesis, University of Waterloo, 2012 , University of Waterloo Library 

[13] Corsin Pfister, One simple postulate implies that every polytopic state space is
classical, Master Thesis, ETH Zurich, arXiv:1203.5622v1

[14] J. Barrett, Information processing in generalized probabilistic theories, Phys. 
Rev. A 75, 032304 (2007), arXiv:quant-ph/0508211v3 

[15] R. Webster, Convexity, Oxford University Press, New York, 1994 








[16] Dirk Werner, Funktionalanalysis, 7., korrigierte und erweiterte Auflage,
Springer Berlin Heidelberg (2011)

[17] C. D. Aliprantis, R. Tourky, Cones and Duality, American Mathematical Society
(2007)

[18] L. Hardy, Quantum Theory From Five Reasonable Axioms,
arXiv:quant-ph/0101012

[19] C. A. Fuchs, Quantum Mechanics as Quantum Information (and only a little
more), in Quantum Theory: Reconstruction of Foundations, A. Khrennikov
(ed.), Växjö University Press (2002), arXiv:quant-ph/0205039

[20] Ll. Masanes, M. P. Müller, A derivation of quantum theory from physical
requirements, New J. Phys. 13, 063001 (2011), arXiv:1004.1483

[21] G. Chiribella, G. M. D’Ariano, P. Perinotti, Informational derivation of Quantum
Theory, Phys. Rev. A 84, 012311 (2011), arXiv:1011.6451

[22] D. Petz, Entropy, von Neumann and the von Neumann entropy, in John von
Neumann and the Foundations of Quantum Physics, eds. M. Rédei and M.
Stöltzner, Kluwer, 2001, arXiv:math-ph/0102013v1

[23] M. Nielsen, I. Chuang, Quantum Computation and Quantum Information, 
Cambridge University Press, 10th Anniversary edition published 2010, 6th 
printing 2014 

[24] A. Short and S. Wehner, Entropy in general physical theories, New Journal of 
Physics 12 (2010) 033023, arXiv:0909.4801 

[25] H. Baehr, S. Kabelac, Thermodynamik Springer Berlin Heidelberg New York, 
13. Auflage (2006) 

[26] H. Barnum, J. Barrett, L. O. Clark, M. Leifer, R. Spekkens, N. Stepanik, A. 
Wilce, R. Wilke, Entropy and information Causality in general probabilistic 
theories, New Journal of Physics 12 (2010) 033024, arXiv:0909.5075 

[27] W. Schneider, S. Haas, Repetitorium Thermodynamik, R. Oldenbourg Verlag,
2., überarbeitete Auflage, 2004

[28] Rényi, Alfréd (1961) On measures of information and entropy, Proceedings of
the fourth Berkeley Symposium on Mathematics, Statistics and Probability
1960. pp. 547-561

[29] A. Peres, Quantum Theory: Concepts and Methods, Kluwer Academic Publishers,
Volume 72, 2002

[30] R. D. Sorkin, Quantum mechanics as quantum measure theory, Mod. Phys. 
Lett. A 9, 3119-3128 (1994), arXiv: gr-qc/9401003 











[31] Howard Barnum, Markus Müller, private communication

[32] A. Marshall, I. Olkin, B. Arnold Inequalities: Theory of Majorization and its 
Applications , Second Edition, Springer Series in Statistics 

[33] M. P. Müller and Ll. Masanes, Three-dimensionality of space and the quantum
bit: an information-theoretic approach, New J. Phys. 15, 053040 (2013),
arXiv:1206.0630v4

[34] W. Nolting, Grundkurs Theoretische Physik 6 - Statistische Physik, 7. Auflage, 
Springer-Verlag Berlin Heidelberg 

[35] W. Nolting, Grundkurs Theoretische Physik 4 - Spezielle Relativitätstheorie,
Thermodynamik, 8. Auflage, Springer-Verlag Berlin Heidelberg

[36] F. Schwabl, Statistische Mechanik, 3. Auflage, Springer Berlin Heidelberg New 
York (2006) 

[37] Daijiro Yoshioka, Statistical Physics - An Introduction, Springer Berlin
Heidelberg 2007

[38] Lee Smolin, Three Roads To Quantum Gravity , Basic Books, A Member of the 
Perseus Books Group, 2001 

[39] C. A. Hein, Entropy in operational statistics and quantum logic , Found. Phys. 
9 751-786 (1979) 

[40] A. Cabello, Specker’s fundamental principle of quantum mechanics,
arXiv:1212.1756

[41] G. Kimura, K. Nuida, H. Imai, Distinguishability measures and entropies
for general probabilistic theories, Rep. Math. Phys. 66, 175 (2010),
arXiv:0910.0994

[42] H.-P. Breuer, F. Petruccione, The Theory of Open Quantum Systems, Oxford
University Press (2002)

[43] E. Joos, H. D. Zeh, C. Kiefer, D. Giulini, J. Kupsch, I.-O. Stamatescu,
Decoherence and the Appearance of a Classical World in Quantum Theory, Second
Edition (2003), Springer-Verlag Berlin Heidelberg New York

[44] http://minecraft.gamepedia.com/Logic_circuit (from 17.01.2015) 
http://minecraft.gamepedia.com/Redstone_circuit (from 17.01.2015) 
http://minecraft.gamepedia.com/Tutorials/Advanced redstone circuits 
(from 17.01.2015) 

[45] http://reference.wolfram.com/language/ref/ArcCot.html (from 27.02.2015), 
http://reference.wolfram.com/language/ref/ArcTan.html (from 27.02.2015), 
http://reference.wolfram.com/language/ref/Cot.html (from 27.02.2015) 





A Appendix: Measurement effects can be extended 
to linear functions 

In this appendix, we consider convex-linear effects only defined on $\Omega_A$ and show
that they can be extended to linear functions on $A$. The proof can also be found in
[?] and [?]. It is included such that the introduction to GPTs is complete.

For any state $w$ in $A_+$, there is a normalized state $v$ and some $p \geq 0$ with $w = pv$.
Except for the 0-state, we find $p = u_A(w)$, $v = \frac{w}{u_A(w)}$, i.e. the decomposition is
unique (except for the 0-state). Thus it is well-defined to set $e(w) = p\,e(v)$. This
also is well-defined for the 0-state, $e(0) = 0$. This definition agrees with our
interpretation of subnormalized states: $e(w) = u_A(w)\, e\!\left(\frac{w}{u_A(w)}\right)$ is the probability that $w$
is successfully prepared times the probability that $e$ is triggered, assuming that the
preparation is successful. Especially, the impossible state never triggers $e$. Thus we
find $e(pw) = p\,e(w)$ for all $p \in \mathbb{R}_{\geq 0}$, $w \in A_+$. Furthermore let $\sum_j p_j w_j$ with $p_j \in \mathbb{R}_{\geq 0}$,
$w_j \in A_+$. Wlog, $p_j \neq 0$ and $w_j \neq 0$ as the zero state will cause no problems for the
linearity. Then:

\begin{align}
e\Big(\sum_j p_j w_j\Big) &= \sum_a p_a u_A(w_a)\; e\Big(\sum_j \frac{u_A(w_j)\, p_j}{\sum_k p_k u_A(w_k)} \frac{w_j}{u_A(w_j)}\Big) \tag{A.1} \\
&= \sum_a p_a u_A(w_a) \sum_j \frac{u_A(w_j)\, p_j}{\sum_k p_k u_A(w_k)}\; e\Big(\frac{w_j}{u_A(w_j)}\Big) \tag{A.2} \\
&= \sum_j p_j\, e(w_j) \tag{A.3}
\end{align}

For the first “=”, we used $e(pw) = p\,e(w)$ for all $p \in \mathbb{R}_{\geq 0}$, $w \in A_+$ and that cones are
closed under sums and multiplication with non-negative numbers. For the second
“=”, we used that $e$ is assumed to be convex-linear on $\Omega_A$. For the third “=” we
used $e(pw) = p\,e(w)$ once more.

Now, let $w = \sum_j p_j w_j$ be an arbitrary linear combination with $w, w_j \in A_+$. Let
$M_+$ be the set of indices $j$ with $p_j > 0$, $M_-$ the set with $p_j < 0$. Then $w + \sum_{j \in M_-} |p_j| w_j = \sum_{j \in M_+} |p_j| w_j$
are positive linear combinations and thus $e(w) + \sum_{j \in M_-} |p_j|\, e(w_j) = \sum_{j \in M_+} |p_j|\, e(w_j)$
gives $e(w) = \sum_j p_j\, e(w_j)$.

As the cone is generating, there is a basis $v_1, \ldots, v_n \in A_+$. We extend to $A$ by
$e\big(\sum_j q_j v_j\big) = \sum_j q_j\, e(v_j)$ for all real $q_j$. This definition does not depend on the choice
of basis, as $e$ already is linear on $A_+$.
B Appendix: Correspondence between the faces of $\Omega_A$ and $A_+$


In this appendix we prove that there is a bijective correspondence between the faces
of $\Omega_A$ and the faces of $A_+$.


Proposition B.1. For an abstract state space, a face $F$ of $\Omega_A$ induces a face $\mathbb{R}_{\geq 0} \cdot F$
of $A_+$. Vice versa, a face $F \neq \{0\}$ of $A_+$ induces a face $\Omega_A \cap F$ of $\Omega_A$. Furthermore,
if $w_1, \ldots, w_m \in \Omega_A$ generate the face $F$ of $\Omega_A$ ($A_+$), they also generate the corresponding
face of $A_+$ ($\Omega_A$). Furthermore for a face $F \subset \Omega_A$, $\Omega_A \cap (\mathbb{R}_{\geq 0} \cdot F) = F$, and vice
versa for a face $G \subset A_+$, $G \neq \{0\}$, we find $G = \mathbb{R}_{\geq 0} \cdot (\Omega_A \cap G)$.

Proof. Let $F$ be a face of $\Omega_A$. We show that $F_+ := \mathbb{R}_{\geq 0} \cdot F$ is a face of $A_+$:

Let $v \in F_+$. $0 \in F_+$ trivially because $F$ is a face and thus $F$ is not empty. So
assume $v \neq 0$. Then $\frac{v}{u_A(v)} \in F$: By definition, there is a $v' \in F \subset \Omega_A$ and a $\lambda \in \mathbb{R}_{\geq 0}$
with $v = \lambda v'$. Thus $u_A(v) = \lambda u_A(v') = \lambda$, therefore $v' = \frac{v}{u_A(v)}$. Thus by definition
$\mathbb{R}_{\geq 0} \cdot v \subset F_+$. Consider now $w \in A_+$, $w = p w_1 + (1-p) w_2$ with $w_1, w_2 \in F_+$ and
$p \in (0,1)$. Wlog we can assume $w_1 \neq 0 \neq w_2$ because we already know that all
non-negative multiples are contained in $F_+$. Because of the normalization, this also
implies $w \neq 0$. Thus:

\[ \frac{w}{u_A(w)} = \frac{p\, u_A(w_1)}{u_A(w)} \frac{w_1}{u_A(w_1)} + \frac{(1-p)\, u_A(w_2)}{u_A(w)} \frac{w_2}{u_A(w_2)} \tag{B.1} \]

Because of proper normalization and positivity, the coefficients $\frac{p\, u_A(w_1)}{u_A(w)}$ and $\frac{(1-p)\, u_A(w_2)}{u_A(w)}$
form a probability distribution. Furthermore, $\frac{w_1}{u_A(w_1)}, \frac{w_2}{u_A(w_2)} \in F$ just like before. As
$F$ is convex, $\frac{w}{u_A(w)} \in F$. Thus $w \in F_+$, i.e. $F_+ = \mathbb{R}_{\geq 0} \cdot F$ is convex.

For $w \in F_+$, assume $w = p w_1 + (1-p) w_2$ with $w_1, w_2 \in A_+$ and $p \in (0,1)$. If
$u_A(w) = 0$, then $w = 0$ because of strict positivity of $u_A$. As $u_A(w_{1,2}) \geq 0$, thus
$u_A(w_{1,2}) = 0$ and therefore $w_{1,2} = 0$, again because of strict positivity. If $u_A(w) \neq 0$,
then $w' := \frac{w}{u_A(w)} \in F$. If $u_A(w_2) = 0$ (or analogously, $u_A(w_1) = 0$), then $w = p w_1$.
Then $w_1 = \frac{w}{p} \in \mathbb{R}_{\geq 0} \cdot F$ because $w \in \mathbb{R}_{\geq 0} \cdot F$. So assume now $u_A(w_2) \neq 0$ and
$u_A(w_1) \neq 0$. Then

\[ \frac{w}{u_A(w)} = \frac{p\, u_A(w_1)}{u_A(w)} \frac{w_1}{u_A(w_1)} + \frac{(1-p)\, u_A(w_2)}{u_A(w)} \frac{w_2}{u_A(w_2)} \in F \tag{B.2} \]

As all the states are properly normalized and $p, 1-p, u_A(w), u_A(w_1), u_A(w_2) > 0$,
we find $1 > \frac{p\, u_A(w_1)}{u_A(w)} > 0$, $\frac{p\, u_A(w_1)}{u_A(w)} + \frac{(1-p)\, u_A(w_2)}{u_A(w)} = 1$. As $F$ is a face of $\Omega_A$,
$\frac{w_1}{u_A(w_1)} \in F$ and $\frac{w_2}{u_A(w_2)} \in F$. Therefore $w_1, w_2 \in \mathbb{R}_{\geq 0} \cdot F$. Thus $F_+$ is a face.
Now let $F \neq \{0\}$ be a face of $A_+$. We now show that $G := F \cap \Omega_A$ is a face
of $\Omega_A$:

$G$ is not empty, because there is a $v \in F$ with $v \neq 0$, i.e. $u_A(v) > 0$. Thus $\frac{v}{u_A(v)} \in G$
by Lemma 3.16.

Let $w_1, w_2 \in G$, $p \in (0,1)$. Then $w_1, w_2 \in F$ and as $F$ is convex, $p w_1 + (1-p) w_2 \in F$.
By proper normalization and as properly normalized states of $A_+$ are elements of
$\Omega_A$, $p w_1 + (1-p) w_2 \in G$.

Let $w \in G$, $p \in (0,1)$, $w_1, w_2 \in \Omega_A$ with $w = p w_1 + (1-p) w_2$. By definition, also
$w \in F$. As $F$ is a face, $w_1, w_2 \in F$. As $w_{1,2} \in \Omega_A$, in total $w_1, w_2 \in \Omega_A \cap F = G$.
Thus $G$ is a face.


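Now assume that the face $F \subset \Omega_A$ is generated by $w_1, \ldots, w_m$. Then the minimal
face

\[ G := \bigcap_{H \subset A_+ \text{ face},\; w_1, \ldots, w_m \in H} H \]

of $A_+$ containing $w_1, \ldots, w_m$ is a subset of the face $\mathbb{R}_{\geq 0} \cdot F$, i.e. $G \subset \mathbb{R}_{\geq 0} \cdot F$. If
$G \neq \mathbb{R}_{\geq 0} \cdot F$, then $\exists w \in \mathbb{R}_{\geq 0} \cdot F$ but $w \notin G$. As all faces contain the 0 and $u_A$ is
strictly positive, $u_A(w) \neq 0$. The face $G \cap \Omega_A$ contains $w_1, \ldots, w_m$. But $\frac{w}{u_A(w)} \in F$ is
not found in $G \cap \Omega_A$, which is a contradiction to $F$ being minimal as a generated face.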


Now assume that the face $F \subset A_+$ is generated by $w_1, \ldots, w_m \in \Omega_A$. Then $w_1, \ldots, w_m$
are also found in the face $\Omega_A \cap F \subset \Omega_A$. Now consider the minimal face $G \subset \Omega_A$
containing the $w_1, \ldots, w_m$. By definition, $G \subset F \cap \Omega_A$. If $G \neq F \cap \Omega_A$, then $\exists w \in F \cap \Omega_A$,
but $w \notin G$. The face $\mathbb{R}_{\geq 0} \cdot G$ also contains the $w_1, \ldots, w_m$, but not $w$. This is a
contradiction to $F$ being minimal.


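Now consider a face $F \subset \Omega_A$. We have $F \subset (\mathbb{R}_{\geq 0} \cdot F)$ and $F \subset \Omega_A$, thus
$F \subset \Omega_A \cap (\mathbb{R}_{\geq 0} \cdot F)$. For $w \in \Omega_A \cap (\mathbb{R}_{\geq 0} \cdot F)$, we have $w \in \Omega_A$ and $\exists w' \in F$,
$p \in \mathbb{R}_{\geq 0}$ with $w = p w'$. By $w \in \Omega_A$, we find $p = 1$ because of normalization, thus
$w \in F$ and in total $\Omega_A \cap (\mathbb{R}_{\geq 0} \cdot F) = F$.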


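Now consider a face $G \subset A_+$ with $G \neq \{0\}$. We have $(\Omega_A \cap G) \subset G$ and thus
$\mathbb{R}_{\geq 0} \cdot (\Omega_A \cap G) \subset G$ by Lemma 3.16. Vice versa for $w \in G$: If $w \neq 0$, then
$\frac{w}{u_A(w)} \in \Omega_A$ by $A_+ = \mathbb{R}_{\geq 0} \cdot \Omega_A$ and proper normalization. Thus $\frac{w}{u_A(w)} \in \Omega_A \cap G$ by
Lemma 3.16. Thus $w \in \mathbb{R}_{\geq 0} \cdot (\Omega_A \cap G)$. As $G \neq \{0\}$, $\exists w' \in G$: $u_A(w') > 0$.
$\frac{w'}{u_A(w')} \in G \cap \Omega_A$, thus $0 \in \mathbb{R}_{\geq 0} \cdot (\Omega_A \cap G)$. In total, $G = \mathbb{R}_{\geq 0} \cdot (\Omega_A \cap G)$. □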
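C Appendix: Entropy decreasing SWAP operation

In this appendix we explain the formal details of a quantum operation which replaces
an arbitrary incoming quantum system by a pure state.

We consider two n-level Hilbert spaces $\mathcal{H}_a = \mathcal{H}_b$ with orthonormal basis $|1\rangle, \ldots, |n\rangle$.
We define the linear map $U$ by $U(|j\rangle_a \otimes |k\rangle_b) := |k\rangle_a \otimes |j\rangle_b$ and extend linearly. This
map is often called the SWAP operation. It is unitary: For $|\psi\rangle = \sum_{j,k} p_{jk} |j\rangle_a \otimes |k\rangle_b$,
$|\phi\rangle = \sum_{j,k} q_{jk} |j\rangle_a \otimes |k\rangle_b$ we find:

\begin{align*}
\langle U\psi | U\phi \rangle &= \sum_{jk,rt} p_{jk}^* q_{rt} \big(\langle k|_a \otimes \langle j|_b\big)\big(|t\rangle_a \otimes |r\rangle_b\big) = \sum_{jk,rt} p_{jk}^* q_{rt}\, \delta_{kt} \delta_{jr} \\
&= \sum_{jk,rt} p_{jk}^* q_{rt} \big(\langle j|_a \otimes \langle k|_b\big)\big(|r\rangle_a \otimes |t\rangle_b\big) = \langle \psi | \phi \rangle
\end{align*}

In particular, we find:

\begin{align*}
U \Big(\sum_{jk} p_{jk} |j\rangle\langle k|_a \otimes |1\rangle\langle 1|_b\Big) U^\dagger &= \sum_{jk} p_{jk}\, U \big(|j\rangle_a \otimes |1\rangle_b\big)\big(\langle k|_a \otimes \langle 1|_b\big) U^\dagger \\
&= \sum_{jk} p_{jk} \big(|1\rangle_a \otimes |j\rangle_b\big)\big(\langle 1|_a \otimes \langle k|_b\big) \\
&= |1\rangle\langle 1|_a \otimes \Big(\sum_{jk} p_{jk} |j\rangle\langle k|_b\Big)
\end{align*}

For the density operator $\rho_a$ on $\mathcal{H}_a$ we define the map:

\[ T(\rho_a) := \mathrm{Tr}_b\big(U\, \rho_a \otimes |1\rangle\langle 1|_b\, U^\dagger\big) \]

In this form it is clear that this is a quantum transformation. The physical
implementation is also clear: One starts with an n-level system in the state $\rho_a$. Then one
creates another n-level system (of the same physical realization) initialized in the
state $|1\rangle$. Then one exchanges the labels of the two systems and throws away the
original system, keeping only the system initialized to $|1\rangle$. This is clearly an experimentally
possible transformation. But as the original system is simply forgotten and we just
create any system of the same physical implementation in any state we like, it is clear
that the transformation can decrease the entropy: $T\big(\frac{1}{n} \sum_j |j\rangle\langle j|_a\big) = |1\rangle\langle 1|_a$, thus

\[ S\Big(\frac{1}{n} \sum_j |j\rangle\langle j|\Big) = -\sum_j \frac{1}{n} \ln\Big(\frac{1}{n}\Big) = \ln(n) > 0 = S\big(|1\rangle\langle 1|\big) = S\Big(T\Big(\frac{1}{n} \sum_j |j\rangle\langle j|\Big)\Big). \]

We note that this transformation does not change the normalization, thus it simply
induces the order unit as measurement. Thus in most generality, there will be
operations that can decrease entropy.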
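The entropy decrease can also be checked numerically. The following Mathematica lines are a minimal sketch for n = 3, not part of the construction above. The names `swap`, `rhoA`, `proj1`, `joint`, `out`, `reduced` and `entropy` are illustrative choices. The sketch assumes the basis ordering in which $|j\rangle_a \otimes |k\rangle_b$ has index $(j-1)n + k$, and implements the partial trace over system b with `ArrayReshape` and `TensorContract`.

```mathematica
n = 3;
(* SWAP matrix: sends the basis vector |j>_a |k>_b, index (j-1) n + k, to |k>_a |j>_b *)
swap = Normal[SparseArray[
    Flatten[Table[{(k - 1) n + j, (j - 1) n + k} -> 1, {j, n}, {k, n}]],
    {n^2, n^2}]];
rhoA = IdentityMatrix[n]/n;                          (* maximally mixed input state *)
proj1 = Normal[SparseArray[{{1, 1} -> 1}, {n, n}]];  (* |1><1| on system b *)
joint = KroneckerProduct[rhoA, proj1];
out = swap . joint . ConjugateTranspose[swap];
(* partial trace over b: contract the second and fourth tensor index *)
reduced = TensorContract[ArrayReshape[out, {n, n, n, n}], {{2, 4}}];
(* von Neumann entropy from the eigenvalues, with the convention 0 Log[0] = 0 *)
entropy[rho_] := -Total[If[# > 0, # Log[#], 0] & /@ Eigenvalues[rho]];
{entropy[rhoA], entropy[reduced]}
```

Under these assumptions the pair should evaluate to `{Log[3], 0}`, in agreement with the formula above: `reduced` is the projector onto $|1\rangle$ whatever input state is chosen.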
D Appendix: Mathematica shows that weak
spectrality in the 2D egg-shape holds

D.l Appendix: States with classical decomposition 

In[11]:= (*Our main parametrization*)

r = 1
R = 3

ParametricPlot[{r*{Cos[a], Sin[a]}, {-R*Cos[b], r*Sin[b]}}, {b, -Pi/2, Pi/2},
 {a, -Pi/2, Pi/2}] (*This plot shows how the state space looks*)

b[a_] := ArcCot[-(R/r)*Cot[a]]

(*a is short for alpha, b for beta. For a tangent line at angle a,
b gives the angle of the parallel tangent line at the other side*)

ParametricPlot[p*r*{Cos[a], Sin[a]} + (1 - p)*{-R*Cos[b[a]], r*Sin[b[a]]},

 {p, 0, 1}, {a, -Pi/2, Pi/2}] (*This plot shows all the points that
have a classical decomposition into at most two pure states*)

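Out[11]= 1

Out[12]= 3

Out[15]= [two plots: the egg-shaped state space, and the set of points with a classical decomposition into at most two pure states; horizontal axis from -3 to 1, vertical axis from -1.0 to 1.0]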
D.2 Appendix: Alternative parametrization



Figure D.1: Detailed picture of how we parametrize the egg.


So now we give an alternative parametrization of the 2D egg: The first part is a
half-circle, given by

\[ (x,y) = (r \sin(\alpha), -r \cos(\alpha)), \qquad \alpha \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \tag{D.1} \]

The second part is an ellipse given by

\[ (x,y) = (r \sin(\beta), R \cos(\beta)), \qquad \beta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \tag{D.2} \]

Choose an arbitrary angle $\alpha$. Then

\[ \frac{dy}{d\alpha} = r \sin(\alpha), \qquad \frac{dx}{d\alpha} = r \cos(\alpha) \tag{D.3} \]

Thus the slope of the tangent line is given by $\frac{dy}{dx} = \tan(\alpha)$. Now we want to find
the parallel tangent line on the other side. At first

\[ \frac{dy}{d\beta} = -R \sin(\beta), \qquad \frac{dx}{d\beta} = r \cos(\beta) \tag{D.4} \]

Thus the slope on the other side is given by $\frac{dy}{dx} = -\frac{R}{r} \tan(\beta)$. We want the slopes
to agree, i.e. $-\frac{R}{r} \tan(\beta) = \tan(\alpha)$. Thus

\[ \beta(\alpha) = \arctan\left(-\frac{r}{R} \tan \alpha\right) \tag{D.5} \]

The line connecting $p_1$ and $p_2$ is

\[ L(p) = p \begin{pmatrix} r \sin(\alpha) \\ -r \cos(\alpha) \end{pmatrix} + (1-p) \begin{pmatrix} r \sin(\beta(\alpha)) \\ R \cos(\beta(\alpha)) \end{pmatrix} \tag{D.6} \]

where $p \in [0,1]$. The next page uses Mathematica to show that weak spectrality /
classical decomposability holds using this parametrization.
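Before plotting, the relation (D.5) can be sanity-checked numerically. The lines below are a small sketch, assuming the same values r = 1 and R = 3 as in the notebook. The names `beta`, `slopeCircle` and `slopeEllipse` are introduced here for illustration only, and the sample angles are arbitrary.

```mathematica
r = 1; R = 3;
(* angle of the parallel tangent on the ellipse, equation (D.5) *)
beta[alpha_] := ArcTan[-(r/R)*Tan[alpha]]
(* slopes dy/dx of the half-circle, from (D.3), and of the ellipse, from (D.4) *)
slopeCircle[alpha_] := Tan[alpha]
slopeEllipse[b_] := -(R/r)*Tan[b]
(* the differences should vanish up to rounding for every sampled angle *)
Table[Chop[slopeCircle[alpha] - slopeEllipse[beta[alpha]]],
 {alpha, {-0.45 Pi, -0.25 Pi, 0.15 Pi, 0.35 Pi}}]
```

A list of zeros indicates that the two tangent lines are parallel at the sampled angles. The endpoints $\alpha = \pm\pi/2$ are excluded because the tangent is vertical there.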



In[21]:= (*Our alternative parametrization*)


r = 1
R = 3

ParametricPlot[{r*{Sin[a], -Cos[a]}, {r*Sin[b], R*Cos[b]}}, {b, -Pi/2, Pi/2},
 {a, -Pi/2, Pi/2}] (*This plot shows how the state space looks*)

b[a_] := ArcTan[-(r/R)*Tan[a]]

(*a is short for alpha, b for beta. For a tangent line at angle a,
b gives the angle of the parallel tangent line at the other side*)

ParametricPlot[p*r*{Sin[a], -Cos[a]} + (1 - p)*{r*Sin[b[a]], R*Cos[b[a]]},

 {p, 0, 1}, {a, -Pi/2, Pi/2}] (*This plot shows all the points that
have a classical decomposition into at most two pure states*)

Out[21]= 1

[Plot output: the egg in the alternative parametrization; horizontal axis from -1.0 to 1.0]
D.3 Appendix: Visualization of the tangent lines with
Mathematica

In[105]:= r = 1
R = 3

b[a_] := ArcCot[-(R/r)*Cot[a]]

(*One plot per tangent angle w: the half-circle, the half-ellipse, the line
connecting the two points of contact, and the two parallel tangent lines*)
tangentPlot[w_] := ParametricPlot[{r*{Cos[a], Sin[a]}, {-R*Cos[a], r*Sin[a]},
   p*r*{Cos[w], Sin[w]} + (1 - p)*{-R*Cos[b[w]], r*Sin[b[w]]},
   r*{Cos[w], Sin[w]} + 2*(0.5 - p)*(1/Sqrt[1 + (Cot[w])^2])*{1, -Cot[w]},
   {-R*Cos[b[w]], r*Sin[b[w]]} + 2*(0.5 - p)*(1/Sqrt[1 + ((r/R) Cot[b[w]])^2])*
     {1, (r/R) Cot[b[w]]}}, {p, 0, 1}, {a, -Pi/2, Pi/2}]

(*The same twelve angles as before: w = -0.49 Pi, -0.45 Pi, ..., 0.45 Pi, 0.49 Pi*)
tangentPlot /@ ({-0.49, -0.45, -0.35, -0.25, -0.15, -0.05, 0.05, 0.15, 0.25, 0.35, 0.45, 0.49}*Pi)